Abstract

It is shown that the knapsack problem, which was introduced by 
Myasnikov et al. for arbitrary finitely generated groups, can be solved 
in NP for graph groups. This result even holds if the group elements 
are represented in a compressed form by SLPs, which generalizes the 
classical NP-completeness result of the integer knapsack problem. We
also prove general transfer results: NP-membership of the knapsack
problem is passed on to finite extensions, HNN-extensions over finite 
associated subgroups, and amalgamated products with finite identified 
subgroups. 


1 Introduction 

In their paper [40], Myasnikov, Nikolaev, and Ushakov started the
investigation of classical discrete optimization problems, which are
classically formulated over the integers, for arbitrary (in general
non-commutative) groups. Among other problems, they introduced for a
finitely generated group $G$ the knapsack problem and the subset sum
problem. The input for the knapsack problem is a sequence of group
elements $g_1, \ldots, g_k, g \in G$ (specified by finite words over the
generators of $G$) and it is asked whether there exists a solution
$(x_1, \ldots, x_k) \in \mathbb{N}^k$ of the equation
$g_1^{x_1} \cdots g_k^{x_k} = g$. For the subset sum problem one restricts
the solution to $\{0,1\}^k$. For the particular case $G = \mathbb{Z}$
(where the additive notation $x_1 \cdot g_1 + \cdots + x_k \cdot g_k = g$
is usually preferred) these problems are NP-complete if the numbers
$g_1, \ldots, g_k, g$ are encoded in binary representation. For subset
sum, this is a classical result from Karp's seminal paper [25] on
NP-completeness. Knapsack for integers is usually formulated in a more
general form in the literature; NP-completeness of the above form (for
binary encoded integers) was shown in [18], where the problem was called
multisubset sum.¹ Interestingly, if we consider subset sum for the group
$G = \mathbb{Z}$, but encode the input numbers $g_1, \ldots, g_k, g$ in
unary notation, then the problem is in DLOGTIME-uniform $\mathsf{TC}^0$
(a small subclass of polynomial time and even of logarithmic space that
captures the complexity of multiplication of binary encoded numbers) [15],
and the same holds for knapsack, since the instance
$x_1 \cdot g_1 + \cdots + x_k \cdot g_k = g$ has a solution if and only if
it has a solution with $x_i \le k \cdot (\max\{g_1, \ldots, g_k, g\})^3$ [?].
This allows one to reduce unary knapsack to unary subset sum. See [22] for
related results.
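For intuition, the integer case can be sketched in a few lines of Python.
The function names `subset_sum` and `knapsack` are illustrative and do not
come from the paper, and the sketch assumes positive integers $g_i$ and a
non-negative target $g$. Both routines are the textbook dynamic programs,
whose running time is polynomial in $k$ and in the value of $g$ (that is,
in the unary input length); they are not the $\mathsf{TC}^0$ procedure
mentioned above.

```python
def subset_sum(gs, g):
    """Is there (x_1, ..., x_k) in {0,1}^k with sum(x_i * g_i) == g?

    Assumes every g_i is a positive integer and g >= 0.
    """
    reachable = {0}
    for gi in gs:
        # Each g_i may be used at most once.
        reachable |= {r + gi for r in reachable if r + gi <= g}
    return g in reachable


def knapsack(gs, g):
    """Is there (x_1, ..., x_k) in N^k with sum(x_i * g_i) == g?

    Assumes every g_i is a positive integer and g >= 0.
    """
    reachable = [False] * (g + 1)
    reachable[0] = True
    for target in range(1, g + 1):
        # target is reachable if removing one copy of some g_i
        # leaves a reachable value.
        reachable[target] = any(
            gi <= target and reachable[target - gi] for gi in gs
        )
    return reachable[g]
```

With the numbers 3 and 5, the value 6 is a knapsack solution (3 + 3) but
not a subset sum solution, which shows the difference between the two
problems.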

In [40] the authors encode elements of the finitely generated group $G$
by words over the group generators and their inverses. For $G = \mathbb{Z}$
this representation corresponds to the unary encoding of integers. Among
others, the following results were shown in [40]:

• Subset sum and knapsack can be solved in polynomial time for every 
hyperbolic group. 

• Subset sum for a virtually nilpotent group (a finite extension of a 
nilpotent group) can be solved in polynomial time. 

• For the following groups, subset sum is NP-complete (whereas the
word problem can be solved in polynomial time): free metabelian
non-abelian groups of finite rank, the wreath product
$\mathbb{Z} \wr \mathbb{Z}$, Thompson's group $F$, and the
Baumslag-Solitar group $\mathrm{BS}(1,2)$.

Further results on knapsack and subset sum have been recently obtained in
[28]:

• For a virtually nilpotent group, subset sum belongs to NL (nondeter- 
ministic logspace). 

¹ Note that if we ask for a solution $(x_1, \ldots, x_k)$ in $\mathbb{Z}^k$,
then knapsack can be solved in polynomial time (even for binary encoded
integers) by checking whether $\gcd(g_1, \ldots, g_k)$ divides $g$.

• There is a nilpotent group of class 2 (in fact, a direct product of
sufficiently many copies of the discrete Heisenberg group) for which
knapsack is undecidable.

• The knapsack problem for the discrete Heisenberg group is decidable.
In particular, together with the previous point it follows that
decidability of knapsack is not preserved under direct products.

• There is a polycyclic group with an NP-complete subset sum problem. 

• The knapsack problem is decidable for all co-context-free groups. 

The focus of this paper will be on the knapsack problem. We will prove 
that this problem can be solved in NP for every graph group. Graph groups 
are also known as right-angled Artin groups or free partially commutative 
groups. A graph group is specified by a finite simple graph. The vertices 
are the generators of the group, and two generators a and b are allowed 
to commute if and only if a and b are adjacent. Graph groups somehow 
interpolate between free groups and free abelian groups and can be seen as 
a group counterpart of trace monoids (free partially commutative monoids) 
that have been used for the specification of concurrent behavior. In
combinatorial group theory, graph groups are currently a hot topic, mainly
because of their rich subgroup structure [?]. To prove that knapsack belongs
to NP for a graph group, we proceed in two steps: 

• We show that if an instance $g_1^{x_1} \cdots g_k^{x_k} = g$ has a
solution in a graph group, then it has a solution, where every $x_i$ is
bounded exponentially in the input length (the total length of all words
representing the group elements $g_1, \ldots, g_k, g$).

• We then guess the binary encodings of numbers $n_1, \ldots, n_k$ that are
bounded by the exponential bound from the previous point and verify
in polynomial time the identity $g_1^{n_1} \cdots g_k^{n_k} = g$. The
latter problem is an instance of the so-called compressed word problem for
a graph group. This is the classical word problem, where the input group
element is given succinctly by a so-called straight-line program (SLP),
which is a context-free grammar that produces a single word (here, a
word over the group generators and their inverses). An SLP with $n$
productions in Chomsky normal form can produce a string of length
$2^n$. It has been shown in [32] that the compressed word problem for a
graph group can be solved in polynomial time, see also [?] for more
details.

In fact, our proof yields a stronger result: First, it yields an NP procedure
for solving knapsack-like equations
$h_0 g_1^{x_1} h_1 \cdots g_k^{x_k} h_k = 1$, where some of the variables
$x_1, \ldots, x_k$ are allowed to be identical. We call such an equation
an exponent equation. Hence, we prove that solvability of exponent equations
over a graph group belongs to NP.

Second, we show that the latter result even holds when the group elements
$g_1, \ldots, g_k, h_0, \ldots, h_k$ are given succinctly by SLPs; we speak
of solvability of compressed exponent equations. This is interesting, since
the SLP-encoding of group elements corresponds in the case $G = \mathbb{Z}$
to the binary encoding of integers. Hence, membership in NP for solvability
of compressed exponent equations over a graph group generalizes the
classical NP membership for knapsack (over $\mathbb{Z}$) to a much wider
class of groups.

Furthermore, we extend the class of groups for which solvability of
knapsack (resp. compressed exponent equations) can be checked in NP by
proving general transfer results. Our first transfer result states that if
$H$ is a finite extension of $G$ and solvability of compressed exponent
equations (or knapsack) can be checked in NP for $G$, then the same holds
for $H$. This provides such algorithms for the large class of virtually
special groups. These are finite extensions of subgroups of graph groups.
Virtually special groups recently played a major role in a spectacular
breakthrough in three-dimensional topology, namely the solution of the
virtual Haken conjecture [1]. In the course of this development it turned
out that the class of virtually special groups is extremely rich: It
contains Coxeter groups [?], one-relator groups with torsion [45], fully
residually free groups [45], and fundamental groups of hyperbolic
3-manifolds [1].

We also prove transfer results for HNN-extensions and amalgamated
products with finite associated (resp. identified) subgroups in the case of
the knapsack problem. Such HNN-extensions and amalgamated products play
a fundamental role in combinatorial group theory [36]. For example, they
appear in Stallings' decomposition of groups with more than one end [42]
and in the construction of virtually free groups [?]. Furthermore, they are
known to preserve a wide variety of structural and algorithmic properties
(see Section 9).

A side product of our proof is that the set of all solutions
$(x_1, \ldots, x_k) \in \mathbb{N}^k$ of an exponent equation
$g_1^{x_1} \cdots g_k^{x_k} = g$ over a graph group is semilinear,
and a semilinear representation can be produced effectively. This seems to
be true for many groups, e.g., for all co-context-free groups [28]. On the
other hand, the discrete Heisenberg group is an example of a group
for which solvability of exponent equations is decidable but the set of all
solutions of an exponent equation is not semilinear; it is defined by a
single quadratic Diophantine equation [28].

Finally, we complement our upper bounds by a new lower bound: Knapsack
and subset sum are both NP-complete for a direct product of two free
groups of rank two ($F_2 \times F_2$). This group is the graph group
corresponding to a cycle of length four. NP-hardness already holds for the
case that the input group elements are specified by words over the
generators (for SLP-compressed words, NP-hardness already holds for
$\mathbb{Z}$) and the exponent variables are allowed to take values in
$\mathbb{Z}$ (instead of $\mathbb{N}$). NP-completeness of subset sum for
$F_2 \times F_2$ solves an open problem from [16].

Related work. The knapsack problem is a special case of the more general 
rational subset membership problem. A rational subset of a finitely generated 
monoid M is the homomorphic image in M of a regular language over the 
generators of M. In the rational subset membership problem for M the 
input consists of a rational subset $L \subseteq M$ (specified by a finite
automaton) and an element $m \in M$ and it is asked whether $m \in L$. It
was shown in [35] that the rational subset membership problem for a graph
group $G$ is
decidable if and only if the corresponding graph has (i) no induced cycle 
on four nodes (C4) and (ii) no induced path on four nodes (P4). For the 
decidable cases, the precise complexity is open. 

Knapsack for $G$ can be also viewed as the question, whether a word
equation $X_1 X_2 \cdots X_n = 1$, where $X_1, \ldots, X_n$ are variables,
together with constraints of the form $\{g^n \mid n \ge 0\}$ for the
variables has a solution in $G$. Such a solution is a mapping
$\varphi : \{X_1, \ldots, X_n\} \to G$ such that
$\varphi(X_1 X_2 \cdots X_n)$ evaluates to 1 in $G$ and all constraints are
satisfied. For another class of constraints (so-called normalized rational
constraints, which do not cover constraints of the form
$\{g^n \mid n \ge 0\}$), solvability of general word equations was shown to
be decidable (PSPACE-complete) for graph groups by Diekert and Muscholl
[?]. This result was extended in [12] to a transfer theorem
for graph products. A graph product is specified by a finite simple graph, 
where every node is labelled with a group. The associated group is obtained 
from the free product of all vertex groups by allowing elements from
adjacent groups to commute. Note that decidability of knapsack is not
preserved
under graph products. It is even not preserved under direct products, see 
the above mentioned results from [28] . 




2 Words and straight-line programs 

For a word $w$ we denote with $\mathrm{alph}(w)$ the set of symbols
occurring in $w$. The length of the word $w$ is $|w|$.

A straight-line program, briefly SLP, is basically a context-free grammar 
that produces exactly one string. To ensure this, the grammar has to be 
acyclic and deterministic (every variable has a unique production where it 
occurs on the left-hand side). Formally, an SLP is a tuple
$\mathcal{G} = (V, \Sigma, \mathrm{rhs}, S)$, where $V$ is a finite set of
variables (or nonterminals), $\Sigma$ is the terminal alphabet, $S \in V$
is the start variable, and rhs maps every variable to a right-hand side
$\mathrm{rhs}(A) \in (V \cup \Sigma)^*$. We require that there is a linear
order $<$ on $V$ such that $B < A$, whenever
$B \in V \cap \mathrm{alph}(\mathrm{rhs}(A))$. Every variable $A \in V$
derives to a unique string $\mathrm{val}_{\mathcal{G}}(A)$ by iteratively
replacing variables by the corresponding right-hand sides, starting with
$A$. Finally, the string derived by $\mathcal{G}$ is
$\mathrm{val}(\mathcal{G}) = \mathrm{val}_{\mathcal{G}}(S)$.

Let $\mathcal{G} = (V, \Sigma, \mathrm{rhs}, S)$ be an SLP. The size of
$\mathcal{G}$ is $|\mathcal{G}| = \sum_{A \in V} |\mathrm{rhs}(A)|$,
i.e., the total length of all right-hand sides. A simple induction shows that
for every SLP $\mathcal{G}$ of size $m$ one has
$|\mathrm{val}(\mathcal{G})| \le O(3^{m/3}) \subseteq 2^{O(m)}$
[8, proof of Lemma 1]. On the other hand, it is straightforward to define
an SLP $\mathcal{H}$ of size $2n$ such that
$|\mathrm{val}(\mathcal{H})| \ge 2^n$. This justifies viewing an SLP
$\mathcal{G}$ as a compressed representation of the string
$\mathrm{val}(\mathcal{G})$, and exponential compression rates can be
achieved in this way. More details on SLPs can be found in the survey [30].
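The definitions of this section can be made concrete with a small Python
sketch. The dictionary representation and the names `slp_size`,
`slp_length`, `slp_val` and `doubling_slp` are assumptions made for
illustration only: an SLP is stored as a map from variables to right-hand
sides (lists of symbols), and every symbol that is not a key of the map is
treated as a terminal. The function `doubling_slp` is one possible SLP of
size $2n$ that derives a string of length $2^n$.

```python
def slp_size(rhs):
    """Size of the SLP: total length of all right-hand sides."""
    return sum(len(body) for body in rhs.values())


def slp_length(rhs, start):
    """Length of the derived string, computed without expanding it."""
    lengths = {}

    def length(symbol):
        if symbol not in rhs:          # terminal symbol
            return 1
        if symbol not in lengths:      # memoize: each variable is visited once
            lengths[symbol] = sum(length(s) for s in rhs[symbol])
        return lengths[symbol]

    return length(start)


def slp_val(rhs, start):
    """The derived string itself (may be exponentially long)."""
    values = {}

    def val(symbol):
        if symbol not in rhs:          # terminal symbol
            return symbol
        if symbol not in values:
            values[symbol] = "".join(val(s) for s in rhs[symbol])
        return values[symbol]

    return val(start)


def doubling_slp(n):
    """An SLP of size 2n (n >= 1) deriving the string a^(2^n)."""
    rhs = {"A1": ["a", "a"]}
    for i in range(2, n + 1):
        rhs[f"A{i}"] = [f"A{i - 1}", f"A{i - 1}"]
    return rhs, f"A{n}"
```

Note that `slp_length` needs only one pass over the variables, whereas
`slp_val` can take exponential time; algorithms on compressed words have to
avoid the latter.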

3 Knapsack and exponent equations 

We assume that the reader has some basic knowledge concerning (finitely
generated) groups, see e.g. [36] for further details. Let $G$ be a finitely
generated group, and let $A$ be a finite generating set for $G$. Then,
elements of $G$ can be represented by finite words over the alphabet
$A^{\pm 1} = A \cup A^{-1}$.

An exponent equation over $G$ is an equation of the form

\[ v_0 u_1^{x_1} v_1 u_2^{x_2} v_2 \cdots u_n^{x_n} v_n = 1, \]

where $u_1, u_2, \ldots, u_n, v_0, v_1, \ldots, v_n \in G$ are group elements
that are given by finite words over the alphabet $A^{\pm 1}$ and
$x_1, x_2, \ldots, x_n$ are not necessarily distinct variables. Such an
exponent equation is solvable if there exists a mapping
$\sigma : \{x_1, \ldots, x_n\} \to \mathbb{N}$ such that
$v_0 u_1^{\sigma(x_1)} v_1 u_2^{\sigma(x_2)} v_2 \cdots u_n^{\sigma(x_n)} v_n = 1$
in the group $G$. Solvability of exponent equations over $G$ is the following
computational problem:

Input: An exponent equation $E$ over $G$ (where elements of $G$ are specified
by words over the group generators and their inverses).

Question: Is E solvable? 

The knapsack problem for the group $G$ is the restriction of solvability of
exponent equations over $G$ to exponent equations of the form
$u_1^{x_1} u_2^{x_2} \cdots u_n^{x_n} u^{-1} = 1$, or, equivalently,
$u_1^{x_1} u_2^{x_2} \cdots u_n^{x_n} = u$, where the exponent variables
$x_1, \ldots, x_n$ have to be pairwise different.

We will also study a compressed version of exponent equations over $G$,
where elements of $G$ are given by SLPs over $A^{\pm 1}$. A compressed
exponent equation is an exponent equation
$v_0 u_1^{x_1} v_1 u_2^{x_2} v_2 \cdots u_n^{x_n} v_n = 1$, where the
group elements $u_1, u_2, \ldots, u_n, v_0, v_1, \ldots, v_n \in G$ are
given by SLPs over the terminal alphabet $A^{\pm 1}$. The sum of the sizes
of these SLPs is the size of the compressed exponent equation.

Let us define solvability of compressed exponent equations over $G$ as the
following computational problem:

Input: A compressed exponent equation $E$ over $G$.

Question: Is E solvable? 

The compressed knapsack problem for G is defined analogously. Note that 
with this terminology, the classical knapsack problem for binary encoded 
integers is the compressed knapsack problem for the group Z. The binary 
encoding of an integer can be easily transformed into an SLP over the
alphabet $\{a, a^{-1}\}$ (where $a$ is a generator of $\mathbb{Z}$) and
vice versa. Thereby the
number of bits in the binary encoding and the size of the SLP are linearly 
related. 
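One possible way to carry out this transformation is sketched below in
Python. The names `int_to_slp` and `exponent`, the variable names `X0`,
`X1`, ... and the string `"a^-1"` for the inverse generator are
illustrative assumptions, not notation from the paper. The SLP follows the
binary digits from the most significant bit downwards (doubling, plus one
extra generator for each 1-bit), so it has at most three symbols per bit.

```python
def int_to_slp(n):
    """Build an SLP over {a, a^-1} whose value represents the integer n.

    Returns (rhs, start), where rhs maps variables to right-hand sides.
    """
    gen = "a" if n >= 0 else "a^-1"
    if n == 0:
        return {"X0": []}, "X0"        # the empty word represents 0
    bits = bin(abs(n))[2:]             # leading bit is always 1
    rhs = {"X0": [gen]}
    for i, bit in enumerate(bits[1:], start=1):
        prev = f"X{i - 1}"
        # val(X_i) = val(X_{i-1})^2, followed by one generator if the bit is 1
        rhs[f"X{i}"] = [prev, prev] + ([gen] if bit == "1" else [])
    return rhs, f"X{len(bits) - 1}"


def exponent(rhs, start):
    """Integer represented by the SLP: (#a) - (#a^-1), without expansion."""
    memo = {}

    def value(symbol):
        if symbol == "a":
            return 1
        if symbol == "a^-1":
            return -1
        if symbol not in memo:
            memo[symbol] = sum(value(s) for s in rhs[symbol])
        return memo[symbol]

    return value(start)
```

The function `exponent` is the converse direction: it recovers the integer
from the SLP in time linear in the SLP size, without writing out the word.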

It is a simple observation that the decidability and complexity of
solvability of (compressed) exponent equations over $G$ as well as the
(compressed) knapsack problem for $G$ do not depend on the chosen finite
generating set
for the group G. Therefore, we do not have to mention the generating set 
explicitly in these problems. 

Remark 1. Since we are dealing with a group, one might also allow solution
mappings $\sigma : \{x_1, \ldots, x_n\} \to \mathbb{Z}$ to the integers. But
this variant of solvability of (compressed) exponent equations (knapsack,
respectively) can be reduced to the above version, where $\sigma$ maps to
$\mathbb{N}$, by simply replacing a power $u_i^{x_i}$ by
$u_i^{x_i} (u_i^{-1})^{y_i}$, where $y_i$ is a fresh variable.

The goal of this paper is to prove the decidability of solvability of
exponent equations for so-called graph groups. We actually prove that
solvability of compressed exponent equations for a graph group belongs to
NP. Graph groups will be introduced in the next section.

4 Traces and graph groups


Let $(A, I)$ be a finite simple graph. In other words, the edge relation
$I \subseteq A \times A$ is irreflexive and symmetric. It is also called the
independence relation, and $(A, I)$ is called an independence alphabet. We
consider the monoid $\mathbb{M}(A, I) = A^*/{\equiv_I}$, where $\equiv_I$ is
the smallest congruence relation on the free monoid $A^*$ that contains all
pairs $(ab, ba)$ with $a, b \in A$ and $(a, b) \in I$. This monoid is called
a trace monoid or partially commutative free monoid. Elements of
$\mathbb{M}(A, I)$ are called Mazurkiewicz traces or simply traces. The
trace represented by the word $u$ is denoted by $[u]_I$, or simply $u$ if no
confusion can arise. For a language $L \subseteq A^*$ we denote with
$[L]_I = \{u \in A^* \mid \exists v \in L : u \equiv_I v\}$ its partially
commutative closure. The length of the trace $[u]_I$ is $|[u]_I| = |u|$ and
its alphabet is $\mathrm{alph}([u]_I) = \mathrm{alph}(u)$. It is easy to see
that these definitions do not depend on the concrete word that represents
the trace $[u]_I$. For subsets $B, C \subseteq A$ we write $B I C$ for
$B \times C \subseteq I$. If $B = \{a\}$ we simply write $a I C$. For traces
$s, t$ we write $s I t$ for $\mathrm{alph}(s) I \mathrm{alph}(t)$. The empty
trace $[\varepsilon]_I$ is the identity element of the monoid
$\mathbb{M}(A, I)$ and is denoted by 1. A trace $t$ is connected if we cannot
factorize $t$ as $t = uv$ with $u \neq 1 \neq v$ and $u I v$.
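The notions of this paragraph can be tested on small examples with the
following Python sketch. It is not taken from the paper: the function names
are illustrative, words are Python strings, and the independence relation is
passed as a symmetric set of letter pairs. Equivalence is decided with the
standard projection criterion (two words represent the same trace if and
only if they contain the same letters with the same multiplicities and agree
on the projection to every pair of dependent letters), and connectedness is
checked via the equivalent condition that the letters of the trace induce a
connected subgraph of the dependence graph.

```python
from itertools import combinations


def make_independence(pairs):
    """Symmetric closure of a list of independent letter pairs."""
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}


def equivalent(u, v, indep):
    """Decide whether the words u and v represent the same trace."""
    if sorted(u) != sorted(v):         # same multiset of letters
        return False
    for a, b in combinations(sorted(set(u)), 2):
        if (a, b) in indep:
            continue                   # independent letters may be reordered
        proj_u = [c for c in u if c in (a, b)]
        proj_v = [c for c in v if c in (a, b)]
        if proj_u != proj_v:           # order of dependent letters is fixed
            return False
    return True


def is_connected(u, indep):
    """Decide whether the trace represented by the word u is connected."""
    letters = set(u)
    if not letters:
        return True                    # the empty trace has no factorization
    seen, todo = set(), [next(iter(letters))]
    while todo:
        a = todo.pop()
        if a in seen:
            continue
        seen.add(a)
        # follow dependence edges inside alph(u)
        todo.extend(b for b in letters if b not in seen and (a, b) not in indep)
    return seen == letters
```

For example, with $a I b$ as the only independence, the words `abc` and
`bac` are equivalent, and the trace `ab` is not connected whereas `acb` is.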

A trace $t \in \mathbb{M}(A, I)$ can be visualized by its dependence graph
$D_t$. To define $D_t$, choose an arbitrary word $w = a_1 a_2 \cdots a_n$,
$a_i \in A$, with $t = [w]_I$ and define
$D_t = (\{1, \ldots, n\}, E, \lambda)$, where
$E = \{(i, j) \mid i < j, (a_i, a_j) \in D\}$, $D = (A \times A) \setminus I$
is the dependence relation, and $\lambda(i) = a_i$. If we identify isomorphic
dependence graphs, then this definition is independent of the chosen word
representing $t$. Moreover, the mapping $t \mapsto D_t$ is injective. As a
consequence of the representation of traces by dependence graphs, one
obtains Levi's lemma for traces, see e.g. [?, p. 74], which is one of the
fundamental facts in trace theory. The formal statement is as follows.

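Lemma 2. Let $u_1, \ldots, u_m, v_1, \ldots, v_n \in \mathbb{M}(A, I)$. Then

\[ u_1 u_2 \cdots u_m = v_1 v_2 \cdots v_n \]

if and only if there exist $w_{i,j} \in \mathbb{M}(A, I)$
($1 \le i \le m$, $1 \le j \le n$) such that

• $u_i = w_{i,1} w_{i,2} \cdots w_{i,n}$ for every $1 \le i \le m$,

• $v_j = w_{1,j} w_{2,j} \cdots w_{m,j}$ for every $1 \le j \le n$, and

• $(w_{i,j}, w_{k,\ell}) \in I$ if $1 \le i < k \le m$ and
$n \ge j > \ell \ge 1$.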

The situation in the lemma will be visualized by a diagram of the following
kind. The $i$-th column corresponds to $u_i$, the $j$-th row corresponds to
$v_j$, and the intersection of the $i$-th column and the $j$-th row
represents $w_{i,j}$. Furthermore $w_{i,j}$ and $w_{k,\ell}$ are independent
if one of them is left-above the other one.


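\[
\begin{array}{c|ccccc}
v_n    & w_{1,n} & w_{2,n} & w_{3,n} & \cdots & w_{m,n} \\
\vdots & \vdots  & \vdots  & \vdots  &        & \vdots  \\
v_3    & w_{1,3} & w_{2,3} & w_{3,3} & \cdots & w_{m,3} \\
v_2    & w_{1,2} & w_{2,2} & w_{3,2} & \cdots & w_{m,2} \\
v_1    & w_{1,1} & w_{2,1} & w_{3,1} & \cdots & w_{m,1} \\
\hline
       & u_1     & u_2     & u_3     & \cdots & u_m
\end{array}
\]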


A consequence of Levi’s Lemma is that trace monoids are cancellative, i.e., 
$usv = utv$ implies $s = t$ for all traces $s, t, u, v \in \mathbb{M}(A, I)$.

For a trace $u \in \mathbb{M}(A, I)$ let $p(u)$ be the number of prefixes of
$u$. We will use the following statement from [3].

Lemma 3. Let $u \in \mathbb{M}(A, I)$ be a trace of length $n$. Then
$p(u) \in O(n^{\alpha})$, where $\alpha$ is the size of a largest clique of
the complementary graph $(A, I)^c = (A, (A \times A) \setminus I)$.
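For small traces, the number $p(u)$ can be computed by brute force. The
Python sketch below is an illustration only (the name `count_prefixes` and
the representation of the independence relation as a symmetric set of letter
pairs are assumptions). It uses the standard fact that the prefixes of a
trace correspond to the downward-closed sets of positions in its dependence
graph, and enumerates these sets explicitly, so its running time is
proportional to $p(u)$ times a polynomial in $|u|$.

```python
def count_prefixes(word, indep):
    """Number of prefixes of the trace represented by word.

    The empty trace and the trace itself are both counted.
    """
    n = len(word)
    # preds[j]: earlier positions whose letter is dependent on word[j];
    # such positions must occur before position j in every prefix.
    preds = [
        frozenset(i for i in range(j) if (word[i], word[j]) not in indep)
        for j in range(n)
    ]
    seen = {frozenset()}
    frontier = [frozenset()]
    while frontier:
        ideal = frontier.pop()
        for j in range(n):
            if j not in ideal and preds[j] <= ideal:
                bigger = ideal | {j}
                if bigger not in seen:
                    seen.add(bigger)
                    frontier.append(bigger)
    return len(seen)
```

If $a$ and $b$ are dependent, the trace $ab$ has the three prefixes $1$,
$a$, $ab$; if $a I b$, then $b$ is a fourth prefix.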

With an independence alphabet $(A, I)$ we associate the group

\[ \mathbb{G}(A, I) = \langle A \mid ab = ba \ ((a, b) \in I) \rangle. \]

Such a group is called a graph group, or right-angled Artin group², or free
partially commutative group. Here, we use the term graph group. Graph groups
received a lot of attention in group theory during the last few years, mainly
due to their rich subgroup structure [?], and their relationship to
low dimensional topology (via so-called virtually special groups) [?].
We represent elements of $\mathbb{G}(A, I)$ by traces over an extended
independence alphabet. For this, let $A^{-1} = \{a^{-1} \mid a \in A\}$ be a
disjoint copy of the alphabet $A$, and let $A^{\pm 1} = A \cup A^{-1}$. We
define $(a^{-1})^{-1} = a$ and for a word $w = a_1 a_2 \cdots a_n$ with
$a_i \in A^{\pm 1}$ we define $w^{-1} = a_n^{-1} \cdots a_2^{-1} a_1^{-1}$.
This defines an involution (without fixed points) on $(A^{\pm 1})^*$. We
extend the independence relation $I$ to $A^{\pm 1}$ by $(a^x, b^y) \in I$
for all $(a, b) \in I$ and $x, y \in \{-1, 1\}$. Then, there is a canonical
surjective morphism $h : \mathbb{M}(A^{\pm 1}, I) \to \mathbb{G}(A, I)$ that
maps every symbol $a \in A^{\pm 1}$ to the corresponding group element. Of
course, $h$ is not injective, but we can easily define a subset
$\mathrm{IRR}(A^{\pm 1}, I) \subseteq \mathbb{M}(A^{\pm 1}, I)$ of
irreducible traces such that $h$ restricted to $\mathrm{IRR}(A^{\pm 1}, I)$
is bijective. The set

² This term comes from the fact that right-angled Artin groups are exactly
the Artin groups corresponding to right-angled Coxeter groups.

$\mathrm{IRR}(A^{\pm 1}, I)$ consists of all traces
$t \in \mathbb{M}(A^{\pm 1}, I)$ such that $t$ does not contain a factor
$[a a^{-1}]_I$ with $a \in A^{\pm 1}$, i.e., there do not exist
$u, v \in \mathbb{M}(A^{\pm 1}, I)$ and $a \in A^{\pm 1}$ such that in
$\mathbb{M}(A^{\pm 1}, I)$ we have a factorization $t = u [a a^{-1}]_I v$.
For every trace $t$ there exists a corresponding irreducible normal form
that is obtained by removing from $t$ factors $[a a^{-1}]_I$ with
$a \in A^{\pm 1}$ as long as possible. It can be shown that this reduction
process is terminating (which is trivial since it reduces the length) and
confluent (in [29] a more general confluence lemma for graph products of
monoids is shown). Hence, the irreducible normal form of $t$ does not depend
on the concrete order of reduction steps. For a group element
$g \in \mathbb{G}(A, I)$ we denote with $|g|$ the length of the unique
trace $t \in \mathrm{IRR}(A^{\pm 1}, I)$ such that $h(t) = g$.

For a trace $t = [u]_I$ ($u \in (A^{\pm 1})^*$) we can define
$t^{-1} = [u^{-1}]_I$. This is well-defined, since $u \equiv_I v$ implies
$u^{-1} \equiv_I v^{-1}$. The following lemma will be important, see
[?, Lemma 23]:

Lemma 4. Let $s, t \in \mathrm{IRR}(A^{\pm 1}, I)$. Then there exist unique
factorizations $s = up$, $t = p^{-1} v$ such that
$uv \in \mathrm{IRR}(A^{\pm 1}, I)$. Hence, $uv$ is the irreducible
normal form of $st$.
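The reduction to irreducible normal form can be sketched as follows in
Python. This is an illustrative implementation under stated assumptions,
not the algorithm of the paper: letters of $A^{\pm 1}$ are encoded as pairs
`(generator, sign)` with sign `1` or `-1`, the independence relation is a
symmetric, irreflexive set of generator pairs, and all function names are
hypothetical. A new letter $x$ is cancelled against the rightmost earlier
occurrence of $x^{-1}$ if every letter in between is independent of $x$;
otherwise it is appended. The result is one word representing the
irreducible trace, and a word is trivial in the graph group exactly when
this result is empty.

```python
def inverse(letter):
    """Formal inverse of a letter (generator, sign)."""
    gen, sign = letter
    return (gen, -sign)


def independent(x, y, indep):
    """(a^x, b^y) is independent iff (a, b) is; signs are irrelevant."""
    return (x[0], y[0]) in indep


def normal_form(word, indep):
    """Return a word representing the irreducible normal form of word."""
    reduced = []
    for x in word:
        cancelled = False
        # Scan the current irreducible word from right to left.
        for pos in range(len(reduced) - 1, -1, -1):
            y = reduced[pos]
            if y == inverse(x):
                del reduced[pos]       # y can be moved next to x and cancels
                cancelled = True
                break
            if not independent(x, y, indep):
                break                  # y blocks all earlier letters
        if not cancelled:
            reduced.append(x)
    return reduced


def is_trivial(word, indep):
    """Word problem: does word represent the identity of the graph group?"""
    return not normal_form(word, indep)
```

This quadratic-time procedure works on uncompressed words only; for
SLP-compressed input one needs the polynomial-time compressed word problem
algorithm cited in the introduction.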

5 Factorizations of powers 

Based on Levi’s lemma we prove in this section a factorization result for 
powers of a connected trace. We start with the case that we factorize such 
a power into two factors. 

Lemma 5. Let $u \in \mathbb{M}(A, I) \setminus \{1\}$ be a connected trace.
Then, for all $x \in \mathbb{N}$ and all traces $y_1, y_2$ the following two
statements are equivalent:

(i) $u^x = y_1 y_2$.

(ii) There exist $l, k, c \in \mathbb{N}$ and traces $s, p$ such that:
$y_1 = u^l s$, $y_2 = p u^k$, $sp = u^c$, $l + k + c = x$, and $c \le |A|$.

Proof. That (ii) implies (i) is clear. It remains to prove that (i) implies
(ii). Assume that $u^x = y_1 y_2$ holds. The case that $x \le |A|$ is
trivial. Hence, assume that $x \ge |A| + 1$. We apply Levi's lemma
(Lemma 2) to the identity $u^x = y_1 y_2$:

\[
\begin{array}{c|ccccccc}
y_2 & u_{1,2} & u_{2,2} & u_{3,2} & u_{4,2} & \cdots & u_{x-1,2} & u_{x,2} \\
y_1 & u_{1,1} & u_{2,1} & u_{3,1} & u_{4,1} & \cdots & u_{x-1,1} & u_{x,1} \\
\hline
    & u       & u       & u       & u       & \cdots & u         & u
\end{array}
\]

Let $A_i = \mathrm{alph}(u_{1,2} \cdots u_{i,2})$. Then
$A_i \subseteq A_{i+1}$. If $A_1 = \emptyset$ then $u_{1,2} = 1$ and we
can go to Case 2 below. Otherwise, assume that $A_1 \neq \emptyset$. In that
case there must exist $1 \le i \le |A|$ such that $A_i = A_{i+1}$, which
implies $\mathrm{alph}(u_{i+1,2}) \subseteq A_i$. Since
$u_{i+1,1} I (u_{1,2} \cdots u_{i,2})$ we also have
$u_{i+1,1} I u_{i+1,2}$. Since $u$ is connected, we have $u_{i+1,1} = 1$ or
$u_{i+1,2} = 1$. We can therefore distinguish the following two cases:

Case 1. There exists $1 \le i \le |A| + 1$ such that $u_{i,1} = 1$. Then
$u_{i,2} = u$, which implies $u_{j,1} = 1$ for all $j \ge i$ (since
$u_{i,2} I u_{j,1}$):


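\[
\begin{array}{c|ccccccccc}
y_2 & u_{1,2} & u_{2,2} & \cdots & u_{i-1,2} & u & u & \cdots & u & u \\
y_1 & u_{1,1} & u_{2,1} & \cdots & u_{i-1,1} & 1 & 1 & \cdots & 1 & 1 \\
\hline
    & u       & u       & \cdots & u         & u & u & \cdots & u & u
\end{array}
\]

Let $s = u_{1,1} u_{2,1} \cdots u_{i-1,1}$ and
$p = u_{1,2} u_{2,2} \cdots u_{i-1,2}$. Thus, $y_1 = u^0 s$,
$y_2 = p u^{x-i+1}$, $sp = u^{i-1}$, and $i - 1 \le |A|$, and the
conclusion of the lemma holds.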

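Case 2. There exists $1 \le i \le |A| + 1$ such that $u_{i,2} = 1$. Then,
$u_{j,2} = 1$ for all $j \le i$ (since $u_{i,1} = u$ and
$u_{j,2} I u_{i,1}$):

\[
\begin{array}{c|ccccccccc}
y_2 & 1 & 1 & \cdots & 1 & 1 & u_{i+1,2} & \cdots & u_{x-1,2} & u_{x,2} \\
y_1 & u & u & \cdots & u & u & u_{i+1,1} & \cdots & u_{x-1,1} & u_{x,1} \\
\hline
    & u & u & \cdots & u & u & u         & \cdots & u         & u
\end{array}
\]

Let $y_1' = u_{i+1,1} \cdots u_{x,1}$. Hence, $u^{x-i} = y_1' y_2$. We can
use induction to get factorizations $y_1' = u^l s$, $y_2 = p u^k$, and
$sp = u^c$ with $c \le |A|$ and $k + l + c = x - i$. Finally, we have
$y_1 = u^{i+l} s$, which shows the conclusion of the lemma. □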

Now we lift Lemma 5 to an arbitrary number of factors.

Lemma 6. Let $u \in \mathbb{M}(A, I) \setminus \{1\}$ be a connected trace
and $m \in \mathbb{N}$, $m \ge 2$. Then, for all $x \in \mathbb{N}$ and
traces $y_1, \ldots, y_m$ the following two statements are equivalent:

(i) $u^x = y_1 y_2 \cdots y_m$.

(ii) There exist traces $p_{i,j}$ ($1 \le j < i \le m$), $s_i$
($1 \le i \le m$) and numbers $x_i, c_j \in \mathbb{N}$
($1 \le i \le m$, $1 \le j \le m - 1$) such that:

• $y_i = \bigl( \prod_{j=1}^{i-1} p_{i,j} \bigr) u^{x_i} s_i$ for all
$1 \le i \le m$,

• $p_{i,j} I p_{k,l}$ if $j < l < k < i$ and $p_{i,j} I (u^{x_k} s_k)$ if
$j < k < i$,

• $s_m = 1$ and for all $1 \le j < m$,
$s_j \prod_{i=j+1}^{m} p_{i,j} = u^{c_j}$,

• $c_j \le |A|$ for all $1 \le j \le m - 1$,

• $x = \sum_{i=1}^{m} x_i + \sum_{i=1}^{m-1} c_i$.

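Proof. Let us first show that (ii) implies (i). Assume that (ii) holds. Then
we get

\[
y_1 y_2 \cdots y_m
  = \prod_{i=1}^{m} \Bigl( \Bigl( \prod_{j=1}^{i-1} p_{i,j} \Bigr) u^{x_i} s_i \Bigr).
\]

The independencies $p_{i,j} I p_{k,l}$ for $j < l < k < i$ and
$p_{i,j} I (u^{x_k} s_k)$ for $j < k < i$ yield

\begin{align*}
\prod_{i=1}^{m} \Bigl( \Bigl( \prod_{j=1}^{i-1} p_{i,j} \Bigr) u^{x_i} s_i \Bigr)
  &= u^{x_1} s_1 p_{2,1} \cdots p_{m,1} \, u^{x_2} s_2 p_{3,2} \cdots p_{m,2} \,
     u^{x_3} s_3 \cdots s_{m-1} p_{m,m-1} \, u^{x_m} s_m \\
  &= u^{x_1} u^{c_1} u^{x_2} u^{c_2} \cdots u^{c_{m-1}} u^{x_m} \\
  &= u^x.
\end{align*}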

We now prove that (i) implies (ii) by induction on $m$. So, assume that
$u^x = y_1 y_2 \cdots y_m$. The case $m = 2$ follows directly from Lemma 5.
Now assume that $m \ge 3$. By Lemma 5 there exist factorizations
$y_1 = u^{x_1} s_1$, $y_2 \cdots y_m = p_1 u^{x'}$, and $s_1 p_1 = u^{c_1}$
with $c_1 \le |A|$ and $x_1 + x' + c_1 = x$. Levi's lemma applied to
$y_2 \cdots y_m = p_1 u^{x'}$ gives the following diagram:

\[
\begin{array}{c|cc}
y_m    & p_{m,1} & y_m'   \\
\vdots & \vdots  & \vdots \\
y_3    & p_{3,1} & y_3'   \\
y_2    & p_{2,1} & y_2'   \\
\hline
       & p_1     & u^{x'}
\end{array}
\]

There exist $y_i'$ with $y_i = p_{i,1} y_i'$ ($2 \le i \le m$),
$y_2' \cdots y_m' = u^{x'}$, and $y_j' I p_{i,1}$ for $j < i$. By induction
on $m$ we get factorizations

\[ y_i' = \Bigl( \prod_{j=2}^{i-1} p_{i,j} \Bigr) u^{x_i} s_i \]

for $2 \le i \le m$ such that for all $2 \le j < i \le m$:

• $p_{i,j} I p_{k,l}$ if $j < l < k < i$ and $p_{i,j} I (u^{x_k} s_k)$ if
$j < k < i$,

• $s_m = 1$ and for all $2 \le j < m$,
$s_j \prod_{i=j+1}^{m} p_{i,j} = u^{c_j}$ for some $c_j \le |A|$,

• $x' = \sum_{i=2}^{m} x_i + \sum_{i=2}^{m-1} c_i$.

Since $y_j' I p_{i,1}$ for $j < i$ we get $p_{i,1} I p_{j,k}$ for
$1 < k < j < i$ and $p_{i,1} I u^{x_j} s_j$ for $1 < j < i$. Finally, we
have

\[ s_1 \prod_{i=2}^{m} p_{i,1} = s_1 p_1 = u^{c_1} \]

and

\[
x = x_1 + c_1 + x'
  = x_1 + c_1 + \sum_{i=2}^{m} x_i + \sum_{i=2}^{m-1} c_i
  = \sum_{i=1}^{m} x_i + \sum_{i=1}^{m-1} c_i.
\]

This proves the lemma. □

Remark 7. In a later section we will apply Lemma 6 in order to replace an
equation $u^x = y_1 y_2 \cdots y_m$ (where $x, y_1, \ldots, y_m$ are
variables and $u$ is a concrete connected trace) by an equivalent
disjunction. Note that the length of all factors $p_{i,j}$ and $s_i$ above
is bounded by $|A| \cdot |u|$. Hence, one can guess these traces as well as
the numbers $c_j \le |A|$ (the guess results in a big disjunction). We can
also guess which of the numbers $x_i$ are zero and which are greater than
zero. After these guesses we can verify the independences
$p_{i,j} I p_{k,l}$ ($j < l < k < i$) and $p_{i,j} I (u^{x_k} s_k)$
($j < k < i$), and the identities $s_m = 1$,
$s_j \prod_{i=j+1}^{m} p_{i,j} = u^{c_j}$ ($1 \le j < m$). If one of them
does not hold, the specific guess does not contribute to the disjunction. In
this way, we can replace the equation $u^x = y_1 y_2 \cdots y_m$ by a big
disjunction of formulas of the form

\[
\exists x_i > 0 \ (i \in K) : \
x = \sum_{i \in K} x_i + c
\ \wedge \bigwedge_{i \in K} y_i = p_i u^{x_i} s_i
\ \wedge \bigwedge_{i \in [1,m] \setminus K} y_i = p_i s_i,
\]

where $K \subseteq [1, m]$, $c \le |A| \cdot (m - 1)$ and the $p_i, s_i$ are
concrete traces of length at most $|A| \cdot (m - 1) \cdot |u|$. The number
of disjuncts in the disjunction will not be important for our purpose.

6 Automata for partially commutative closures 

In this section, we present several automata constructions that are
well-known from the theory of recognizable trace languages [?, Chapter 2].
For our purpose we need upper bounds on the size (the size of an automaton
is its number of states) of the constructed automata. In our specific
situation we can obtain better bounds than those obtained from the known
constructions. Therefore, we present the constructions in detail.

Let us fix an independence alphabet $(A, I)$ and let
$\mathcal{A} = (Q, A, \Delta, q_0, F)$ be a nondeterministic finite
automaton (NFA) over the alphabet $A$, where
$\Delta \subseteq Q \times A \times Q$ is the transition relation,
$q_0 \in Q$ is the initial state and $F \subseteq Q$ is the set of final
states. Then, $\mathcal{A}$ is an $I$-diamond NFA if for all $(a, b) \in I$
and all transitions $(p, a, q), (q, b, r) \in \Delta$ there exists a state
$q'$ such that $(p, b, q'), (q', a, r) \in \Delta$. For an $I$-diamond
automaton we have $L(\mathcal{A}) = [L(\mathcal{A})]_I$. The NFA
$\mathcal{A}$ is memorizing if (i) every state is accessible from the
initial state $q_0$ and (ii) there is a mapping $\alpha : Q \to 2^A$ such
that for every word $w \in A^*$, if $q_0 \xrightarrow{w}_{\mathcal{A}} q$,
then $\alpha(q) = \mathrm{alph}(w)$.

Lemma 8. Let $\mathcal{A}_1$ and $\mathcal{A}_2$ be $I$-diamond NFA and let
$n_i$ be the number of states of $\mathcal{A}_i$. Assume that
$\mathcal{A}_2$ is memorizing. Then there exists an $I$-diamond NFA for
$[L(\mathcal{A}_1) L(\mathcal{A}_2)]_I$ with $n_1 \cdot n_2$ many states.

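Proof. Let $\mathcal{A}_i = (Q_i, A, \Delta_i, q_{0,i}, F_i)$ for
$i \in \{1, 2\}$. Let $\alpha_2 : Q_2 \to 2^A$ be the map witnessing the
fact that $\mathcal{A}_2$ is memorizing. Then, let

\[
\mathcal{A} = (Q_1 \times Q_2, A, \Delta, (q_{0,1}, q_{0,2}), F_1 \times F_2),
\]

where

\begin{align*}
\Delta = {} & \{ ((p_1, p_2), a, (q_1, p_2)) \mid
              (p_1, a, q_1) \in \Delta_1, \ a I \alpha_2(p_2) \} \ \cup \\
            & \{ ((p_1, p_2), a, (p_1, q_2)) \mid (p_2, a, q_2) \in \Delta_2 \}.
\end{align*}

This indeed defines an $I$-diamond NFA.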

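We show that the following two statements are equivalent for all
$w \in A^*$, $p_1 \in Q_1$, and $p_2 \in Q_2$:

(i) $(q_{0,1}, q_{0,2}) \xrightarrow{w}_{\mathcal{A}} (p_1, p_2)$.

(ii) There are $w_1, w_2 \in A^*$ such that $w \equiv_I w_1 w_2$,
$q_{0,1} \xrightarrow{w_1}_{\mathcal{A}_1} p_1$, and
$q_{0,2} \xrightarrow{w_2}_{\mathcal{A}_2} p_2$.

This clearly implies that
$L(\mathcal{A}) = [L(\mathcal{A}_1) L(\mathcal{A}_2)]_I$.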

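Let us first prove that (i) implies (ii). The case $w = \varepsilon$ is
clear. Hence, let $w = w'a$. Then there exist $p_1' \in Q_1$,
$p_2' \in Q_2$ such that

\[
(q_{0,1}, q_{0,2}) \xrightarrow{w'}_{\mathcal{A}} (p_1', p_2')
                   \xrightarrow{a}_{\mathcal{A}} (p_1, p_2).
\]

By induction, there exists a factorization $w' \equiv_I w_1' w_2'$ such that
$q_{0,1} \xrightarrow{w_1'}_{\mathcal{A}_1} p_1'$ and
$q_{0,2} \xrightarrow{w_2'}_{\mathcal{A}_2} p_2'$. Note that
$\mathrm{alph}(w_2') = \alpha_2(p_2')$. There are two cases:

Case 1. $p_1' \xrightarrow{a}_{\mathcal{A}_1} p_1$, $p_2' = p_2$, and
$a I \alpha_2(p_2)$. Thus, $a I w_2'$. We get
$w = w'a \equiv_I w_1' w_2' a \equiv_I (w_1' a) w_2'$. Let $w_1 = w_1' a$
and $w_2 = w_2'$. We get $q_{0,1} \xrightarrow{w_1}_{\mathcal{A}_1} p_1$
and $q_{0,2} \xrightarrow{w_2}_{\mathcal{A}_2} p_2$.

Case 2. $p_2' \xrightarrow{a}_{\mathcal{A}_2} p_2$ and $p_1 = p_1'$. Let
$w_1 = w_1'$ and $w_2 = w_2' a$. Thus,
$w = w'a \equiv_I w_1' w_2' a = w_1 w_2$. Moreover, we have
$q_{0,1} \xrightarrow{w_1}_{\mathcal{A}_1} p_1$ and
$q_{0,2} \xrightarrow{w_2}_{\mathcal{A}_2} p_2$.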

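Let us now prove that (ii) implies (i). Assume that $w \equiv_I w_1 w_2$,
$q_{0,1} \xrightarrow{w_1}_{\mathcal{A}_1} p_1$, and
$q_{0,2} \xrightarrow{w_2}_{\mathcal{A}_2} p_2$. We have to show that
$(q_{0,1}, q_{0,2}) \xrightarrow{w}_{\mathcal{A}} (p_1, p_2)$. But since
$\mathcal{A}$ is an $I$-diamond NFA, it suffices to show that
$(q_{0,1}, q_{0,2}) \xrightarrow{w_1 w_2}_{\mathcal{A}} (p_1, p_2)$, which
follows directly from the assumption and the definition of $\mathcal{A}$
(note that $\alpha_2(q_{0,2}) = \emptyset$). This concludes the proof of
the lemma. □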
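The product construction from this proof can be sketched in Python as
follows. The dictionary-based NFA representation (keys `states`, `delta`,
`init`, `final`), the function names, and the way the map $\alpha_2$ is
passed (as a dictionary `alpha2` from states of the second automaton to
sets of letters) are assumptions made for illustration. The caller is
responsible for supplying $I$-diamond automata and a correct memorizing map;
the sketch does not check these hypotheses.

```python
def nfa_accepts(nfa, word):
    """Standard subset simulation of an NFA on a word."""
    states = {nfa["init"]}
    for a in word:
        states = {q for (p, b, q) in nfa["delta"] if p in states and b == a}
    return bool(states & nfa["final"])


def product(nfa1, nfa2, alpha2, indep):
    """Product automaton with |Q1| * |Q2| states, as in the proof above."""

    def independent_of(a, letters):
        # a I B holds iff (a, b) is in I for every b in B (true for empty B)
        return all((a, b) in indep for b in letters)

    delta = set()
    # Moves of the first automaton are allowed only if the letter is
    # independent of everything the second automaton has read so far.
    for (p1, a, q1) in nfa1["delta"]:
        for p2 in nfa2["states"]:
            if independent_of(a, alpha2[p2]):
                delta.add(((p1, p2), a, (q1, p2)))
    # Moves of the second automaton are always allowed.
    for (p2, a, q2) in nfa2["delta"]:
        for p1 in nfa1["states"]:
            delta.add(((p1, p2), a, (p1, q2)))
    return {
        "states": {(p1, p2) for p1 in nfa1["states"] for p2 in nfa2["states"]},
        "delta": delta,
        "init": (nfa1["init"], nfa2["init"]),
        "final": {(f1, f2) for f1 in nfa1["final"] for f2 in nfa2["final"]},
    }
```

As a usage example, take an automaton for $a^*$ and a memorizing automaton
for $b^*$. If $a$ and $b$ are dependent, the product accepts exactly
$a^* b^*$; if $a I b$, it accepts every word over $\{a, b\}$.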

In general, for a regular language $L \subseteq A^*$, the partially
commutative closure $[L]_I$ is not regular. For instance, if
$A = \{a, b\}$ and $a I b$, then $[(ab)^*]_I$ consists of all words with the
same number of $a$'s as $b$'s. On the other hand, it is well known that if
$u$ is a connected trace, then $[u^*]_I$ is regular (a more general result,
known as Ochmański's theorem, holds in fact, see e.g. [?, Section 2.3]). For
our purpose we need an upper bound on the size of an $I$-diamond NFA for
$[u^*]_I$ (with $u$ connected). Recall that $p(u)$ is the number of
different prefixes of the trace $u$.
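Membership in $[u^*]_I$ can be checked directly on small examples, which is
convenient for testing automata constructions such as the one that follows.
The Python sketch below is illustrative only (the names `equivalent` and
`in_closure_of_star` are assumptions). It uses that equivalent words have
the same length, so a word $w$ belongs to $[u^*]_I$ if and only if $|w|$ is
a multiple $k \cdot |u|$ and $w \equiv_I u^k$; equivalence is decided by
comparing letter counts and the projections to all pairs of dependent
letters.

```python
from itertools import combinations


def equivalent(u, v, indep):
    """Decide whether the words u and v represent the same trace."""
    if sorted(u) != sorted(v):
        return False
    for a, b in combinations(sorted(set(u)), 2):
        if (a, b) in indep:
            continue
        if [c for c in u if c in (a, b)] != [c for c in v if c in (a, b)]:
            return False
    return True


def in_closure_of_star(w, u, indep):
    """Decide whether w lies in the partially commutative closure of u^*.

    Assumes that u is a non-empty word.
    """
    if len(w) % len(u) != 0:
        return False
    k = len(w) // len(u)
    return equivalent(w, u * k, indep)
```

With $a I b$, the word `aabb` lies in $[(ab)^*]_I$ because it has as many
$a$'s as $b$'s; without the independence, it does not.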

Lemma 9. Let $u \in \mathbb{M}(A, I) \setminus \{1\}$ be connected. There is
a memorizing $I$-diamond NFA for $[u^*]_I$ of size $2 \cdot p(u)^{|A|}$.

Proof. The following construction can be found in [39, Proposition 5] for
the more general case of the partially commutative closure of a so-called
loop-connected automaton. We present the construction in our simplified
situation, since the NFA gets slightly smaller.

We first define a non-memorizing $I$-diamond NFA $\mathcal{A}$ for
$[u^*]_I$ of size $p(u)^{|A|}$. Then, we show that by adding an additional
bit to all states, we can get a memorizing $I$-diamond NFA for $[u^*]_I$ of
size $2 \cdot p(u)^{|A|}$. The idea for the construction of $\mathcal{A}$ is
implicitly contained in the proof of Lemma 5. Assume that the automaton
wants to read a word of the form $u^x$ and a prefix $y_1$ is already read.
Then $y_1$ must be of the form $u^l s$, where $s$ is a prefix of $u^c$ for
some $c \le |A|$. The prefix $s$ must be of the form $u_1 u_2 \cdots u_c$
such that if $u = u_i v_i$, then $v_i I u_j$ if $i < j$. The state of the
NFA stores the tuple $(u_1, u_2, \ldots, u_c)$.

Define $\mathcal{A} = (Q, A, \Delta, q_0, F)$, where $Q$ is the set of all tuples
$(u_1, u_2, \ldots, u_c)$ of traces such that there exist $v_1, \ldots, v_c \in \mathbb{M}(A,I)$
with $u = u_i v_i$ (since $\mathbb{M}(A,I)$ is cancellative, the $v_i$ are uniquely
determined by the $u_i$), $u_i \neq 1$, $v_i \neq 1$, and $v_i\,I\,u_j$ if $i < j$. Note that
we must have $c \le |A|$. If $c > |A|$, then there exists $i \le |A|$ such that
$\mathrm{alph}(u_1 \cdots u_i) = \mathrm{alph}(u_1 \cdots u_{i+1})$. Hence,
$\mathrm{alph}(u_{i+1}) \subseteq \mathrm{alph}(u_1 \cdots u_i)$. Since $(v_1 \cdots v_i)\,I\,u_{i+1}$
we get $u_{i+1}\,I\,v_{i+1}$, which contradicts $u_{i+1} \neq 1 \neq v_{i+1}$ and the fact
that $u$ is connected.

Since $u_i \neq 1$ for all $i$, we can encode a state $(u_1, u_2, \ldots, u_c) \in Q$ by the
tuple $(u_1, u_2, \ldots, u_c, 1, \ldots, 1)$ of length $|A|$. This implies that the number
of states of $\mathcal{A}$ is bounded by $\rho(u)^{|A|}$. Note that if $|u| = 1$, then the
empty tuple $()$ is the only state.

The transitions of $\mathcal{A}$ are defined as follows, where $(u_1, u_2, \ldots, u_c) \in Q$:

(a) $() \xrightarrow{a}_{\mathcal{A}} ()$ if $u = a \in A$,

(b) $(u_1, u_2, \ldots, u_c) \xrightarrow{a}_{\mathcal{A}} (u_2, \ldots, u_c)$ if $c > 0$, $u_1 a = u$ and $a\,I\,(u_2 \cdots u_c)$,

(c) $(u_1, u_2, \ldots, u_c) \xrightarrow{a}_{\mathcal{A}} (u_1, \ldots, u_i, a, u_{i+1}, \ldots, u_c)$ if $a\,I\,(u_{i+1} \cdots u_c)$ and
    $(u_1, \ldots, u_i, a, u_{i+1}, \ldots, u_c) \in Q$,

(d) $(u_1, u_2, \ldots, u_c) \xrightarrow{a}_{\mathcal{A}} (u_1, \ldots, u_{i-1}, u_i a, u_{i+1}, \ldots, u_c)$ if $a\,I\,(u_{i+1} \cdots u_c)$ and
    $(u_1, \ldots, u_{i-1}, u_i a, u_{i+1}, \ldots, u_c) \in Q$.

The initial state as well as the final state is the empty tuple $()$. It is easy to
check that this is indeed an $I$-diamond NFA.

We claim that for every state $(u_1, \ldots, u_c) \in Q$ and every $w \in A^*$ the
following two statements are equivalent (which shows that $L(\mathcal{A}) = [u^*]_I$):

(i) $() \xrightarrow{w}_{\mathcal{A}} (u_1, \ldots, u_c)$

(ii) $w =_I u^k u_1 \cdots u_c$ for some $k \ge 0$

Let us first show by induction on $|w|$ that (i) implies (ii). The case $w = \varepsilon$ is
clear. So, assume that $w = w'a$. There must exist a state $(u'_1, \ldots, u'_{c'}) \in Q$
such that

$$ () \xrightarrow{w'}_{\mathcal{A}} (u'_1, \ldots, u'_{c'}) \xrightarrow{a}_{\mathcal{A}} (u_1, \ldots, u_c). $$

By induction, we get $w' =_I u^{\ell} u'_1 \cdots u'_{c'}$ for some $\ell \ge 0$. The definition
of the transitions of $\mathcal{A}$ implies that
$w = w'a =_I u^{\ell} u'_1 \cdots u'_{c'} a =_I u^k u_1 \cdots u_c$, where $k \in \{\ell, \ell + 1\}$.

For the direction from (ii) to (i) assume that $w =_I u^k u_1 \cdots u_c$ for some
$k \ge 0$. We have to show that $() \xrightarrow{w}_{\mathcal{A}} (u_1, \ldots, u_c)$. Since $\mathcal{A}$ is
an $I$-diamond NFA, it suffices to show that
$() \xrightarrow{u^k u_1 \cdots u_c}_{\mathcal{A}} (u_1, \ldots, u_c)$. But this follows
directly from the definition of $\mathcal{A}$.

To make $\mathcal{A}$ memorizing, we first keep only those states that are accessible
from the initial state $()$. Then, we add an extra bit to every state that
indicates whether we have already seen a completed occurrence of $u$. Thus, the
new set of states is $Q \times \{0, 1\}$, the initial state is the pair $((), 0)$, and the
final states are $((), 0)$ and $((), 1)$. The transitions operate on the $Q$-component
as for $\mathcal{A}$. The $\{0, 1\}$-component is copied except for a transition
$(q, a, p) \in \Delta$ of type (b). This transition gives us the transitions
$((q, 0), a, (p, 1))$ and $((q, 1), a, (p, 1))$. Then, we can define the $\alpha$-mapping by

$$ \alpha((u_1, \ldots, u_c), i) = \bigcup_{j=1}^{c} \mathrm{alph}(u_j) \cup \mathrm{alph}(u^i). $$

The resulting NFA is still an $I$-diamond NFA. □

A direct consequence of Lemmas 8 and 9 is:

Lemma 10. Let $p, u, s \in \mathbb{M}(A, I)$ with $u \neq 1$ connected. There is an NFA
for $[pu^*s]_I$ of size $2 \cdot \rho(p) \cdot \rho(s) \cdot \rho(u)^{|A|}$.

Proof. We first construct an $I$-diamond NFA for $p$ (which is identified here
with the set of words $\{w \in A^* \mid p = [w]_I\}$) with $\rho(p)$ many states by taking
the set of all prefixes of $p$ as states. Then, we construct a memorizing
$I$-diamond NFA for $[u^*]_I$ with $2 \cdot \rho(u)^{|A|}$ states using Lemma 9. By Lemma 8
we get an $I$-diamond automaton for $[pu^*]_I$ with $2 \cdot \rho(p) \cdot \rho(u)^{|A|}$ many states.
Finally, we construct an $I$-diamond NFA for $s$ with $\rho(s)$ many states by
taking the set of all prefixes of $s$ as states. This NFA is also memorizing.
Hence, we can apply Lemma 8 to get an NFA for $[pu^*s]_I$ with
$2 \cdot \rho(p) \cdot \rho(s) \cdot \rho(u)^{|A|}$ many states. □

The main lemma from this section that will be needed later is: 

Lemma 11. Let $p, q, u, v, s, t \in \mathbb{M}(A,I)$ with $u \neq 1$ and $v \neq 1$ connected.
Let $m = \max\{\rho(p), \rho(q), \rho(s), \rho(t)\}$ and $n = \max\{\rho(u), \rho(v)\}$. Then the set

$$ L(p, u, s, q, v, t) := \{(x, y) \in \mathbb{N} \times \mathbb{N} \mid p u^x s = q v^y t\} $$

is semilinear and is a union of $O(m^8 \cdot n^{4|A|})$ many linear sets of the form
$\{(a + bz, c + dz) \mid z \in \mathbb{N}\}$ with $a, b, c, d \in O(m^8 \cdot n^{4|A|})$.

Proof. By Lemma 10 there exists an NFA for $[pu^*s]_I$ of size

$$ k = 2 \cdot \rho(p) \cdot \rho(s) \cdot \rho(u)^{|A|} \le 2 \cdot m^2 \cdot n^{|A|} $$

and an NFA for $[qv^*t]_I$ of size

$$ \ell = 2 \cdot \rho(q) \cdot \rho(t) \cdot \rho(v)^{|A|} \le 2 \cdot m^2 \cdot n^{|A|}. $$

Then, we obtain an NFA $\mathcal{A}$ for $L = [pu^*s]_I \cap [qv^*t]_I$ with $k \cdot \ell$ states. We
are only interested in the length of words from $L$. Hence, we replace in $\mathcal{A}$
every transition label by the symbol $a$. The resulting NFA $\mathcal{B}$ is defined over
a unary alphabet. Let $P = \{n \mid a^n \in L(\mathcal{B})\}$. By [33, Theorem 1], the set $P$
can be written as a union

$$ P = \bigcup_{i=1}^{r} \{b_i + c_i \cdot z \mid z \in \mathbb{N}\} $$

with $r \in O(k^2 \ell^2) \subseteq O(m^8 \cdot n^{4|A|})$ and
$b_i, c_i \in O(k^2 \ell^2) \subseteq O(m^8 \cdot n^{4|A|})$. For every $1 \le i \le r$ and $z \in \mathbb{N}$
there must exist a pair $(x, y) \in \mathbb{N} \times \mathbb{N}$ such that

$$ b_i + c_i \cdot z = |ps| + |u| \cdot x = |qt| + |v| \cdot y. $$

In particular, $b_i \ge |ps|$, $b_i \ge |qt|$, $|u|$ divides $b_i - |ps|$ and $c_i$, and $|v|$
divides $b_i - |qt|$ and $c_i$. We get:

$$ L(p,u,s,q,v,t) = \bigcup_{i=1}^{r} \left\{ \left( \frac{b_i - |ps|}{|u|} + \frac{c_i}{|u|} \cdot z,\; \frac{b_i - |qt|}{|v|} + \frac{c_i}{|v|} \cdot z \right) \;\middle|\; z \in \mathbb{N} \right\}. $$

This shows the lemma. □


7 Linear Diophantine equations 

We will also need a bound on the norm of a smallest vector in a certain kind
of semilinear sets. We will easily obtain this bound from a result by Zur
Gathen and Sieveking [44].

Lemma 12. Let $A \in \mathbb{Z}^{n \times m}$, $a \in \mathbb{Z}^n$, $C \in \mathbb{N}^{k \times m}$, $c \in \mathbb{N}^k$. Let $\beta$ be an
upper bound for the absolute value of all entries in $A$, $a$, $C$, $c$. The set

$$ L = \{Cz + c \mid z \in \mathbb{N}^m, Az = a\} \subseteq \mathbb{N}^k \tag{1} $$

is semilinear. Moreover, if $L \neq \emptyset$ then $L$ contains a vector with all entries
bounded by $\beta + n! \cdot m \cdot (m + 1) \cdot \beta^{n+1}$.

Proof. Semilinearity of $L$ is clear since the set is Presburger-definable. For
the size bound, we use a result by Zur Gathen and Sieveking [44] to bound
the size of a smallest non-negative solution of the system $Az = a$. Let
$A \in \mathbb{Z}^{n \times m}$, $B \in \mathbb{Z}^{p \times m}$, $a \in \mathbb{Z}^n$, $b \in \mathbb{Z}^p$. Let $r = \mathrm{rank}(A)$, and
$s = \mathrm{rank}\begin{pmatrix} A \\ B \end{pmatrix}$. Let $M$ be an upper bound on the absolute values of all
$(s-1) \times (s-1)$- or $(s \times s)$-subdeterminants of the $(n+p) \times (m+1)$-matrix
$\begin{pmatrix} A & a \\ B & b \end{pmatrix}$, which are formed with at least $r$ rows from the matrix $(A \; a)$.
Then by the main result of [44], the system $Az = a$, $Bz \ge b$ has an integer
solution if and only if it has an integer solution $z$ such that the absolute
value of every entry of $z$ is bounded by $(m + 1)M$.

In our situation, we set $p = m$, $B$ is the $m$-dimensional identity matrix,
and $b$ is the vector with all entries equal to zero (then $Bz \ge b$ expresses
that all entries of $z$ are non-negative). Since $\begin{pmatrix} A \\ B \end{pmatrix}$ is an $(n+m) \times m$-matrix
we get $s = \mathrm{rank}\begin{pmatrix} A \\ B \end{pmatrix} \le m$. We claim that the absolute values of all
$(s \times s)$-subdeterminants (and also all $(s-1) \times (s-1)$-subdeterminants) of the
matrix $\begin{pmatrix} A & a \\ B & b \end{pmatrix}$ are bounded by $n! \cdot \beta^n$. To see this, select $s$ rows and $s$
columns from $\begin{pmatrix} A & a \\ B & b \end{pmatrix}$ and consider the resulting submatrix $D$. Recall
Leibniz' formula for the determinant (where $S_s$ is the set of all permutations
of $\{1, \ldots, s\}$):

$$ \det(D) = \sum_{\sigma \in S_s} \mathrm{sgn}(\sigma) \prod_{i=1}^{s} D_{i,\sigma(i)}. $$

Assume that the rows $1, \ldots, s_1$ ($s_1 \le n$) of $D$ are from the $n \times (m+1)$-submatrix
$(A, a)$. The remaining ($s_2 := s - s_1$ many) rows $s_1 + 1, \ldots, s$ of $D$ are from
$(B, b)$. If one of the rows $s_1 + 1, \ldots, s$ of $D$ only contains zeros, then
$\det(D) = 0$. Otherwise, since $B$ is the identity matrix and $b$ is the zero
vector, each of the rows $s_1 + 1, \ldots, s$ contains a unique 1; all other entries
are zero. That means that every permutation $\sigma \in S_s$ that gives a non-zero
contribution to $\det(D)$ must take fixed values on $s_1 + 1, \ldots, s$. For the values
of $\sigma$ on the rows $1, \ldots, s_1$, only $s_1 \le n$ many values remain. Hence, at most
$s_1! \le n!$ many permutations contribute a non-zero value to $\det(D)$. Moreover,
every such contribution is bounded by $\beta^{s_1} \le \beta^n$, which gives the bound
$n! \cdot \beta^n$ on $\det(D)$. It follows that if $Az = a$ has a non-negative solution, then
it has a non-negative solution where every entry is bounded by
$(m + 1) \cdot n! \cdot \beta^n$.

By substituting every entry of $z$ by $(m + 1) \cdot n! \cdot \beta^n$ in $Cz + c$, it follows
that if the set $L$ in (1) is non-empty, then it contains a vector with all entries
bounded by $\beta + n! \cdot m \cdot (m + 1) \cdot \beta^{n+1}$. □
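
The intermediate bound $(m + 1) \cdot n! \cdot \beta^n$ from the proof can be tried out on tiny systems. The following Python sketch is only an illustration under assumptions: the function name `small_solution` is hypothetical, $\beta$ is computed from $A$ and $a$ alone (and taken to be at least 1), and the search is an exhaustive enumeration that is exponential in $m$.

```python
from itertools import product
from math import factorial


def small_solution(A, a):
    """Search for z in N^m with A z = a, looking only at vectors whose
    entries are at most (m + 1) * n! * beta^n (the bound from the proof)."""
    n, m = len(A), len(A[0])
    beta = max(1, max(abs(x) for row in A for x in row), max(abs(x) for x in a))
    bound = (m + 1) * factorial(n) * beta ** n
    for z in product(range(bound + 1), repeat=m):
        if all(sum(A[i][j] * z[j] for j in range(m)) == a[i] for i in range(n)):
            return z, bound
    return None, bound


# 3*z1 - 2*z2 = 1 has the small solution (1, 1); the bound here is 9.
z, bound = small_solution([[3, -2]], [1])
```

By the argument above, a return value of `None` means that the system has no solution in $\mathbb{N}^m$ at all, not just none within the bound.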
8 Exponent equations in graph groups


The aim of this section is to prove the following two statements, where G is 
a fixed graph group: 

• The set of solutions of an exponent equation over G is (effectively) 
semilinear. 

• Solvability of compressed exponent equations over G belongs to NP. 

We start with some definitions. As usual, we fix an independence alphabet
$(A, I)$. In the following we will consider reduction rules on sequences of
traces. For better readability we separate the consecutive traces in such a
sequence by commas. Let $u_1, u_2, \ldots, u_n \in \mathrm{IRR}(A^{\pm 1}, I)$ be irreducible traces.
The sequence $u_1, u_2, \ldots, u_n$ is $I$-freely reducible if it can be reduced to the
empty sequence $\varepsilon$ by the following rules:

• $u_i, u_j \to u_j, u_i$ if $u_i\,I\,u_j$

• $u_i, u_j \to \varepsilon$ if $u_i = u_j^{-1}$ in $G(A, I)$

• $u_i \to \varepsilon$ if $u_i = \varepsilon$.

A concrete sequence of these rewrite steps leading to the empty sequence
is a reduction of the sequence $u_1, u_2, \ldots, u_n$. Such a reduction can be seen
as a witness for the fact that $u_1 u_2 \cdots u_n = 1$ in $G(A, I)$. On the other
hand, $u_1 u_2 \cdots u_n = 1$ does not necessarily imply that $u_1, u_2, \ldots, u_n$ has a
reduction. For instance, the sequence $a^{-1}, ab, b^{-1}$ has no reduction. But
we can show that every sequence which multiplies to 1 in $G$ can be refined
(by factorizing the elements of the sequence) such that the resulting refined
sequence has a reduction. For getting an NP-algorithm, it is important to
bound the length of the refined sequence exponentially in the length of the
initial sequence.
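
The three rewrite rules can be explored by brute force on small sequences. The Python sketch below is an illustration under assumptions, not the construction used in the proofs: traces are given as strings that are assumed to be irreducible, an uppercase letter stands for the inverse of the lowercase generator, $I$ is a set of pairs of lowercase letters, and trace equality is tested by comparing projections onto pairs of dependent letters. All function names are hypothetical.

```python
def inverse(u):
    """Formal inverse of a word: reverse it and invert every letter."""
    return u[::-1].swapcase()


def independent(u, v, I):
    """True if every letter of u is independent of every letter of v."""
    return all((x.lower(), y.lower()) in I or (y.lower(), x.lower()) in I
               for x in u for y in v)


def trace_equal(u, v, I):
    """Two words denote the same trace iff their projections onto every
    pair of dependent letters (including x == y) coincide."""
    letters = set(u) | set(v)
    for x in letters:
        for y in letters:
            if x == y or not independent(x, y, I):
                keep = {x, y}
                if [c for c in u if c in keep] != [c for c in v if c in keep]:
                    return False
    return True


def freely_reducible(seq, I):
    """Exhaustively apply the three rules to decide I-free reducibility."""
    start = tuple(seq)
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        if not s:
            return True
        successors = [s[:i] + s[i + 1:] for i, u in enumerate(s) if u == ""]
        for i in range(len(s) - 1):
            u, v = s[i], s[i + 1]
            if independent(u, v, I):            # rule: swap independent neighbours
                successors.append(s[:i] + (v, u) + s[i + 2:])
            if trace_equal(v, inverse(u), I):   # rule: cancel inverse neighbours
                successors.append(s[:i] + s[i + 2:])
        for t in successors:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return False
```

For example, `freely_reducible(["A", "ab", "B"], set())` fails, in line with the sequence $a^{-1}, ab, b^{-1}$ above, while the refined sequence `["A", "a", "b", "B"]` is reducible.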

Lemma 13. Let $n \ge 2$ and $u_1, u_2, \ldots, u_n \in \mathrm{IRR}(A^{\pm 1}, I)$. If $u_1 u_2 \cdots u_n = 1$
in $G(A, I)$, then there exist factorizations $u_i = u_{i,1} \cdots u_{i,k_i}$ such that the
sequence

$$ u_{1,1}, \ldots, u_{1,k_1}, u_{2,1}, \ldots, u_{2,k_2}, \ldots, u_{n,1}, \ldots, u_{n,k_n} $$

is $I$-freely reducible. Moreover, $\sum_{i=1}^{n} k_i \le 2^n - 2$.

Proof. We prove the lemma by induction on $n$. The case $n = 2$ is trivial
(we must have $u_2 = u_1^{-1}$). If $n \ge 3$ then by Lemma 0] we can factorize
$u_1$ and $u_2$ as $u_1 = ps$ and $u_2 = s^{-1}t$ such that $v := pt$ is irreducible.
Hence, $v u_3 \cdots u_n = 1$ in $G(A, I)$. By induction, we obtain factorizations
$pt = v = v_1 \cdots v_k$ and $u_i = v_{i,1} \cdots v_{i,k_i}$ ($3 \le i \le n$) such that the sequence

$$ v_1, \ldots, v_k, v_{3,1}, \ldots, v_{3,k_3}, \ldots, v_{n,1}, \ldots, v_{n,k_n} \tag{2} $$

is $I$-freely reducible. Moreover,

$$ k + \sum_{i=3}^{n} k_i \le 2^{n-1} - 2. $$

By applying Levi's lemma to the identity $pt = v_1 v_2 \cdots v_k$, we obtain
factorizations $v_i = u_{i,1} u_{i,2}$ such that $p = u_{1,1} \cdots u_{k,1}$, $t = u_{1,2} \cdots u_{k,2}$ and
$u_{i,2}\,I\,u_{j,1}$ for $1 \le i < j \le k$.

Fix a concrete reduction of the sequence (2). We now consider the following
sequence

$$ u_{1,1}, \ldots, u_{k,1}, s, s^{-1}, u_{1,2}, \ldots, u_{k,2}, \tilde{v}_{3,1}, \ldots, \tilde{v}_{3,k_3}, \ldots, \tilde{v}_{n,1}, \ldots, \tilde{v}_{n,k_n} \tag{3} $$

where the subsequence $\tilde{v}_{i,j}$ is $u_{l,2}^{-1}, u_{l,1}^{-1}$ if $v_{i,j}$ cancels against $v_l$ in our
fixed reduction of (2) (which, in particular, implies that
$v_{i,j} = v_l^{-1} = u_{l,2}^{-1} u_{l,1}^{-1}$). Otherwise (i.e., if $v_{i,j}$ does not cancel against
any $v_l$ in our fixed reduction), we set $\tilde{v}_{i,j} = v_{i,j}$.

Note that $u_{1,1} \cdots u_{k,1} s = ps = u_1$, $s^{-1} u_{1,2} \cdots u_{k,2} = s^{-1} t = u_2$ and
the concatenation of all traces in $\tilde{v}_{i,1}, \ldots, \tilde{v}_{i,k_i}$ is $u_i$ for $3 \le i \le n$. Hence,
it remains to show that the sequence (3) is $I$-freely reducible. First of
all, $u_{1,1}, \ldots, u_{k,1}, s, s^{-1}, u_{1,2}, \ldots, u_{k,2}$ reduces to
$u_{1,1}, \ldots, u_{k,1}, u_{1,2}, \ldots, u_{k,2}$, which can be rearranged to
$u_{1,1}, u_{1,2}, u_{2,1}, u_{2,2}, \ldots, u_{k,1}, u_{k,2}$ using the fact that $u_{i,2}\,I\,u_{j,1}$ for
$1 \le i < j \le k$. Finally, the sequence

$$ u_{1,1} u_{1,2}, u_{2,1} u_{2,2}, \ldots, u_{k,1} u_{k,2}, v_{3,1}, \ldots, v_{3,k_3}, \ldots, v_{n,1}, \ldots, v_{n,k_n} $$

is $I$-freely reducible. The definition of $\tilde{v}_{i,j}$ allows us to basically apply the
fixed reduction of (2) to this sequence.

The number of traces in the sequence (3) can be estimated as

$$ 2k + 2 + 2 \cdot \sum_{i=3}^{n} k_i \le 2 \cdot (2^{n-1} - 2) + 2 = 2^n - 2. $$

This concludes the proof of the lemma. □

We now come to the main technical result of this paper. Let $\alpha \le |A|$ be the
size of a largest clique of the complementary graph $(A, I)^c = (A, (A \times A) \setminus I)$.

Theorem 14. Let $u_1, u_2, \ldots, u_n \in G(A, I) \setminus \{1\}$, $v_0, v_1, \ldots, v_n \in G(A, I)$
and let $x_1, \ldots, x_n$ be variables (we may have $x_i = x_j$ for $i \neq j$) ranging over
$\mathbb{N}$. Then, the set of solutions of the exponent equation

$$ v_0 u_1^{x_1} v_1 u_2^{x_2} v_2 \cdots u_n^{x_n} v_n = 1 $$

is semilinear. Moreover, if there is a solution, then there is a solution with
$x_i \in O((\alpha n)! \cdot 2^{2\alpha^2 n(n+3)} \cdot \mu^{8\alpha(n+1)} \cdot \nu^{8\alpha|A|(n+1)})$, where

• $\mu \in O(|A|^{\alpha} \cdot 2^{2\alpha^2 n} \cdot \lambda^{\alpha})$,

• $\nu \in O(\lambda^{\alpha})$, and

• $\lambda = \max\{|u_1|, |u_2|, \ldots, |u_n|, |v_0|, |v_1|, \ldots, |v_n|\}$.

Proof. Let us choose irreducible traces for $u_1, u_2, \ldots, u_n, v_0, v_1, \ldots, v_n$; we
denote these traces with the same letters as the group elements. A trace $u$
is called cyclically reduced if there do not exist $a \in A^{\pm 1}$ and $v$ such that
$u = a v a^{-1}$. For every trace $u$ there exist unique traces $p, w$ such that
$u = p w p^{-1}$ and $w$ is cyclically reduced (since the reduction relation
$a^{-1} x a \to x$ is terminating and confluent). These traces $p$ and $w$ can be
computed in polynomial time. Note that for a cyclically reduced irreducible
trace $w$, every power $w^n$ is irreducible. By replacing every $u_i^{x_i}$ by
$p_i w_i^{x_i} p_i^{-1}$ with $u_i = p_i w_i p_i^{-1}$ and $w_i$ cyclically reduced, we can assume
that all $u_i$ are cyclically reduced and irreducible. In case one of the traces
$u_i$ is not connected, we can write $u_i$ as $u_i = u_{i,1} u_{i,2}$ with $u_{i,1}\,I\,u_{i,2}$ and
$u_{i,1} \neq 1 \neq u_{i,2}$. Thus, we can replace the power $u_i^{x_i}$ by $u_{i,1}^{x_i} u_{i,2}^{x_i}$. Note
that $u_{i,1}$ and $u_{i,2}$ are still irreducible and cyclically reduced. By doing this,
the number $n$ from the theorem multiplies by at most $\alpha$ (which is the maximal
number of pairwise independent letters). In order to keep the notation simple
we still use the letter $n$ for the number of $u_i$, but at the end of the proof we
have to multiply $n$ by $\alpha$ in the derived bound. Hence, for the further proof we
can assume that all $u_i$ are connected, irreducible and cyclically reduced. Let
$\lambda$ be the maximal length of one of the traces $u_1, u_2, \ldots, u_n, v_0, v_1, \ldots, v_n$,
which does not increase by the above preprocessing.

We now apply Lemma 13 to the equation

$$ v_0 u_1^{x_1} v_1 u_2^{x_2} v_2 \cdots u_n^{x_n} v_n = 1, \tag{4} $$

where every $u_i^{x_i}$ is viewed as a single factor. Note that by our preprocessing,
all factors $u_1^{x_1}, \ldots, u_n^{x_n}, v_0, \ldots, v_n$ are irreducible (for all choices of the
$x_i$). By taking a big disjunction over (i) all possible factorizations of the
$2n + 1$ factors $u_1^{x_1}, \ldots, u_n^{x_n}, v_0, \ldots, v_n$ into totally at most $2^{2n+1} - 2$
factors and (ii) all possible reductions of the resulting refined factorization
of $v_0 u_1^{x_1} v_1 u_2^{x_2} v_2 \cdots u_n^{x_n} v_n$, it follows that (4) is equivalent to a disjunction
of statements of the following form: There exist traces $y_{i,1}, \ldots, y_{i,k_i}$
($1 \le i \le n$) and $z_{i,1}, \ldots, z_{i,l_i}$ ($0 \le i \le n$) such that

(a) $u_i^{x_i} = y_{i,1} \cdots y_{i,k_i}$ ($1 \le i \le n$)

(b) $v_i = z_{i,1} \cdots z_{i,l_i}$ ($0 \le i \le n$)

(c) $y_{i,j}\,I\,y_{k,l}$ for all $(i,j,k,l) \in J_1$

(d) $y_{i,j}\,I\,z_{k,l}$ for all $(i,j,k,l) \in J_2$

(e) $z_{i,j}\,I\,z_{k,l}$ for all $(i,j,k,l) \in J_3$

(f) $y_{i,j} = y_{k,l}^{-1}$ for all $(i,j,k,l) \in M_1$

(g) $y_{i,j} = z_{k,l}^{-1}$ for all $(i,j,k,l) \in M_2$

(h) $z_{i,j} = z_{k,l}^{-1}$ for all $(i,j,k,l) \in M_3$

Here, the numbers $k_i$ and $l_i$ sum up to at most $2^{2n+1} - 2$ (hence, some $k_i$ can
be exponentially large, whereas $l_i$ can be bounded by the length of $v_i$, which
is at most $\lambda$). The tuple sets $J_1, J_2, J_3$ collect all independences between
the factors $y_{i,j}, z_{k,l}$ that are necessary to carry out the chosen reduction of
the refined left-hand side in (4). Similarly, the tuple sets $M_1, M_2, M_3$ tell
us which of the factors $y_{i,j}, z_{k,l}$ cancels against which of the factors $y_{i,j}, z_{k,l}$
in our chosen reduction of the refined left-hand side in (4). Note that every
factor $y_{i,j}$ (resp., $z_{k,l}$) appears in exactly one of the identities (f), (g), (h)
(since in the reduction every factor cancels against another unique factor).

Next, we simplify our statements. Since the $v_i$ are concrete traces (of
length at most $\lambda$), we can take a disjunction over all possible factorizations
$v_i = v_{i,1} \cdots v_{i,l_i}$ ($0 \le i \le n$). This allows us to replace every variable $z_{i,j}$ by
a concrete trace $v_{i,j}$. Statements of the form $v_{i,j}\,I\,v_{k,l}$ and $v_{i,j} = v_{k,l}^{-1}$ can, of
course, be eliminated. Moreover, if there is an identity $y_{i,j} = v_{k,l}^{-1}$ then we
can replace the variable $y_{i,j}$ by the concrete trace $v_{k,l}^{-1}$ (of length at most $\lambda$).

In the next step, we replace statements of the form $u_i^{x_i} = y_{i,1} \cdots y_{i,k_i}$
($1 \le i \le n$). Note that some of the variables $y_{i,j}$ might have been replaced
by concrete traces of length at most $\lambda$. We apply to each of these equations
Lemma 6, or better Remark 7. This allows us to replace every equation
$u_i^{x_i} = y_{i,1} \cdots y_{i,k_i}$ ($1 \le i \le n$) by a disjunction of statements of the following
form: There exist numbers $x_{i,j} > 0$ ($1 \le i \le n$, $j \in K_i$) such that

• $x_i = c_i + \sum_{j \in K_i} x_{i,j}$ for all $1 \le i \le n$,

• $y_{i,j} = p_{i,j} u_i^{x_{i,j}} s_{i,j}$ for all $1 \le i \le n$, $j \in K_i$,

• $y_{i,j} = p_{i,j} s_{i,j}$ for all $1 \le i \le n$, $j \in [1, k_i] \setminus K_i$.

Here, $K_i \subseteq [1, k_i]$, the $c_i$ are concrete numbers with $c_i \le |A| \cdot (k_i - 1)$,
and the $p_{i,j}, s_{i,j}$ are concrete traces of length at most
$|A| \cdot (k_i - 1) \cdot |u_i| \le |A| \cdot (2^{2n+1} - 3) \cdot \lambda$. Hence, the length of these traces
can be exponential in $n$.

Note that since $x_{i,j} > 0$, we know the alphabet of $y_{i,j} = p_{i,j} u_i^{x_{i,j}} s_{i,j}$ (resp.,
$y_{i,j} = p_{i,j} s_{i,j}$). This allows us to eliminate all independences of the form
$y_{i,j}\,I\,y_{k,l}$ for $(i,j,k,l) \in J_1$ (see (c)) and $y_{i,j}\,I\,z_{k,l}$ for $(i,j,k,l) \in J_2$ (see (d)).
Note that all variables $z_{k,l}$ have already been replaced by concrete traces. If
$y_{i,j}$ was already replaced by a concrete trace, then we can determine from
an equation $y_{i,j} = p_{i,j} u_i^{x_{i,j}} s_{i,j}$ the exponent $x_{i,j}$. Since $y_{i,j}$ was replaced by
a trace of length at most $\lambda$ (a small number), we get $x_{i,j} \le \lambda$, and we can
replace $x_{i,j}$ in $x_i = c_i + \sum_{j \in K_i} x_{i,j}$ by a concrete number of size at most $\lambda$.
Finally, if $y_{i,j}$ was replaced by a concrete trace, and we have an equation of
the form $y_{i,j} = p_{i,j} s_{i,j}$, then the resulting identity is either true or false and
can be eliminated.

After this step, we obtain a big disjunction of statements of the following
form: There exist numbers $x_{i,j} > 0$ ($1 \le i \le n$, $j \in K'_i$) such that

(a') $x_i = c_i + \sum_{j \in K'_i} x_{i,j}$ for all $1 \le i \le n$, and

(b') $p_{i,j} u_i^{x_{i,j}} s_{i,j} = s_{k,l}^{-1} (u_k^{-1})^{x_{k,l}} p_{k,l}^{-1}$ for all $(i,j,k,l) \in M$.

Here, $K'_i \subseteq K_i$ is a set of size at most $k_i \le 2^{2n+1} - 2$,
$c_i \le |A| \cdot (k_i - 1) + \lambda \cdot k_i < (|A| + \lambda) \cdot (2^{2n+1} - 2)$, and the $p_{i,j}, s_{i,j}$ are
concrete traces of length at most $|A| \cdot (2^{2n+1} - 3) \cdot \lambda$. The set $M$ specifies a
matching in the sense that for every exponent $x_{a,b}$ ($1 \le a \le n$, $b \in K'_a$) there
is a unique $(i,j,k,l) \in M$ such that $(i,j) = (a,b)$ or $(k,l) = (a,b)$. Note that

$$ |M| = \frac{1}{2} \sum_{i=1}^{n} |K'_i| \le \frac{1}{2} \sum_{i=1}^{n} k_i \le \frac{1}{2} (2^{2n+1} - 2) = 2^{2n} - 1. $$

We now apply Lemma 11 to the identities
$p_{i,j} u_i^{x_{i,j}} s_{i,j} = s_{k,l}^{-1} (u_k^{-1})^{x_{k,l}} p_{k,l}^{-1}$.
Each such identity can be replaced by a disjunction of constraints

$$ (x_{i,j}, x_{k,l}) \in \{(a_{i,j,k,l} + b_{i,j,k,l} \cdot z_{i,j,k,l},\; c_{i,j,k,l} + d_{i,j,k,l} \cdot z_{i,j,k,l}) \mid z_{i,j,k,l} \in \mathbb{N}\}. $$

For the numbers $a_{i,j,k,l}, b_{i,j,k,l}, c_{i,j,k,l}, d_{i,j,k,l}$ we obtain the bound

$$ a_{i,j,k,l}, b_{i,j,k,l}, c_{i,j,k,l}, d_{i,j,k,l} \in O(\mu^8 \cdot \nu^{8|A|}) $$

(the alphabet of the traces is $A^{\pm 1}$, which has size $2|A|$; therefore, we have to
multiply in Lemma 11 $|A|$ by 2), where, by LemmaO

$$ \mu = \max\{\rho(p_{i,j}), \rho(s_{i,j}), \rho(p_{k,l}), \rho(s_{k,l})\} \in O(|A|^{\alpha} \cdot 2^{2\alpha n} \cdot \lambda^{\alpha}) \tag{5} $$

and

$$ \nu = \max\{\rho(u_i), \rho(u_k)\} \in O(\lambda^{\alpha}). \tag{6} $$

Note that $\rho(t) = \rho(t^{-1})$ for every trace $t$. The above equation (a') for $x_i$ can
be now written as

$$ x_i = c_i + \sum_{(i,j,k,l) \in M} (a_{i,j,k,l} + b_{i,j,k,l} \cdot z_{i,j,k,l}) + \sum_{(k,l,i,j) \in M} (c_{k,l,i,j} + d_{k,l,i,j} \cdot z_{k,l,i,j}). $$

Note that the two sums in this equation contain in total $|K'_i| \le 2^{2n+1}$
many summands (since for every $j \in K'_i$ there is a unique pair $(k,l)$ with
$(i,j,k,l) \in M$ or $(k,l,i,j) \in M$).

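Hence, after a renaming of symbols, the initial equation (4) becomes
equivalent to a finite disjunction of statements of the form: There exist
$z_1, \ldots, z_m \in \mathbb{N}$ (these $z_i$ are the above $z_{i,j,k,l}$ and $m = |M|$) such that

$$ x_i = a_i + \sum_{j=1}^{m} a_{i,j} z_j \quad \text{for all } 1 \le i \le n. \tag{7} $$

Moreover, we have the following size bounds:

• $m = |M| \le 2^{2n} - 1$,

• $a_i \in O(c_i + |K'_i| \cdot \mu^8 \cdot \nu^{8|A|}) \subseteq O(2^{2n}(|A| + \lambda + \mu^8 \cdot \nu^{8|A|})) \subseteq O(2^{2n} \cdot \mu^8 \cdot \nu^{8|A|})$,

• $a_{i,j} \in O(\mu^8 \cdot \nu^{8|A|})$.

Recall that some of the variables $x_i$ can be identical. W.l.o.g. assume that
$x_1, \ldots, x_k$ are pairwise different and for all $k + 1 \le i \le n$, $x_i = x_{f(i)}$, where
$f : [k + 1, n] \to [1, k]$. Then, the system of equations (7) is equivalent to

$$ x_i = a_i + \sum_{j=1}^{m} a_{i,j} z_j \quad \text{for all } 1 \le i \le k, $$

$$ a_i - a_{f(i)} = \sum_{j=1}^{m} (a_{f(i),j} - a_{i,j}) z_j \quad \text{for all } k + 1 \le i \le n. $$

The set of all $(x_1, \ldots, x_k) \in \mathbb{N}^k$ for which there exist $z_1, \ldots, z_m \in \mathbb{N}$ satisfying
these equalities is semilinear by Lemma 12, and if it is non-empty then it
contains a vector $(x_1, \ldots, x_k) \in \mathbb{N}^k$ such that

$$ x_i \in O(n! \cdot 2^{2n(n+3)} \cdot \mu^{8(n+1)} \cdot \nu^{8|A|(n+1)}). $$

This follows from the bound $\beta + n! \cdot m \cdot (m+1) \cdot \beta^{n+1}$ of Lemma 12 with
$\beta \in O(2^{2n} \cdot \mu^8 \cdot \nu^{8|A|})$, at most $n$ equations, and $m \le 2^{2n} - 1$ variables.
Recall that in this bound we have to replace $n$ by $\alpha \cdot n$ due to the initial
preprocessing. This proves the theorem. □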

Theorem 15. Let $(A, I)$ be a fixed independence alphabet. Solvability of
compressed exponent equations over the graph group $G(A, I)$ is in NP.

Proof. Consider a compressed exponent equation

$$ E = (v_0 u_1^{x_1} v_1 u_2^{x_2} v_2 \cdots u_n^{x_n} v_n = 1), $$

where $u_i = \mathrm{val}(\mathcal{G}_i)$ and $v_i = \mathrm{val}(\mathcal{H}_i)$ for given SLPs
$\mathcal{G}_1, \ldots, \mathcal{G}_n, \mathcal{H}_0, \ldots, \mathcal{H}_n$. Let
$m = \max\{|\mathcal{G}_1|, \ldots, |\mathcal{G}_n|, |\mathcal{H}_0|, \ldots, |\mathcal{H}_n|\}$. By Theorem 14 we know that
if there exists a solution for $E$ then there exists a solution $\sigma$ with
$\sigma(x_i) \in O((\alpha n)! \cdot 2^{2\alpha^2 n(n+3)} \cdot \mu^{8\alpha(n+1)} \cdot \nu^{8\alpha|A|(n+1)})$, where

• $\mu \in O(|A|^{\alpha} \cdot 2^{2\alpha^2 n} \cdot \lambda^{\alpha})$,

• $\nu \in O(\lambda^{\alpha})$,

• $\lambda = \max\{|u_1|, |u_2|, \ldots, |u_n|, |v_0|, |v_1|, \ldots, |v_n|\} \in 2^{O(m)}$, and

• $\alpha \le |A|$.

Note that the bound on the $\sigma(x_i)$ is exponential in the input length (the sum
of the sizes of all $\mathcal{G}_i$ and $\mathcal{H}_i$). Hence, we can guess in polynomial time the
binary encodings of numbers
$k_i \in O((\alpha n)! \cdot 2^{2\alpha^2 n(n+3)} \cdot \mu^{8\alpha(n+1)} \cdot \nu^{8\alpha|A|(n+1)})$
(where $k_i = k_j$ if $x_i = x_j$). Then, we have to verify whether

$$ \mathrm{val}(\mathcal{H}_0)\,\mathrm{val}(\mathcal{G}_1)^{k_1}\,\mathrm{val}(\mathcal{H}_1)\,\mathrm{val}(\mathcal{G}_2)^{k_2}\,\mathrm{val}(\mathcal{H}_2) \cdots \mathrm{val}(\mathcal{G}_n)^{k_n}\,\mathrm{val}(\mathcal{H}_n) = 1 $$

in the graph group $G(A, I)$. This is an instance of the so-called compressed
word problem for $G(A, I)$, where the input consists of an SLP $\mathcal{G}$ over the
alphabet $A^{\pm 1}$ and it is asked whether $\mathrm{val}(\mathcal{G}) = 1$ in $G(A, I)$. Note that
the big powers $\mathrm{val}(\mathcal{G}_i)^{k_i}$ can be produced with the productions of $\mathcal{G}_i$ and
additional $\lceil \log k_i \rceil$ many productions (using iterated squaring). Since the
compressed word problem for a graph group can be solved in deterministic
polynomial time [31, 32], the statement of the theorem follows. For the last
step, it is important that $(A, I)$ is fixed. □
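
The iterated-squaring step can be sketched as follows. This is a minimal Python illustration under assumptions: an SLP is modelled as a dictionary from nonterminals to tuples of symbols, every symbol that is not a key is a terminal string, and the freshly generated nonterminal names are assumed not to clash with existing ones. The sketch uses $O(\log k)$ new productions (binary square-and-multiply) rather than matching the exact count stated above.

```python
def add_power(slp, start, k):
    """Extend `slp` so that the returned nonterminal derives val(start)^k."""
    new = dict(slp)
    if k == 0:
        name = f"{start}^0"
        new[name] = ()                      # derives the empty word
        return new, name
    squares = [start]                       # squares[j] derives val(start)^(2^j)
    for j in range(1, k.bit_length()):
        name = f"{start}^(2^{j})"
        new[name] = (squares[-1], squares[-1])
        squares.append(name)
    top = f"{start}^{k}"
    # Multiply together the squares selected by the binary digits of k.
    new[top] = tuple(squares[j] for j in range(k.bit_length()) if (k >> j) & 1)
    return new, top


def val(slp, symbol):
    """Fully expand a symbol (only sensible for short words)."""
    if symbol not in slp:
        return symbol
    return "".join(val(slp, s) for s in slp[symbol])


def length(slp, symbol, memo=None):
    """Length of the derived word, computed without expanding it."""
    memo = {} if memo is None else memo
    if symbol not in slp:
        return len(symbol)
    if symbol not in memo:
        memo[symbol] = sum(length(slp, s, memo) for s in slp[symbol])
    return memo[symbol]


slp = {"S": ("X", "X", "b"), "X": ("a",)}   # val(S) = "aab"
small, top5 = add_power(slp, "S", 5)
big, top = add_power(slp, "S", 10**6)
```

Here `big` derives a word of length $3 \cdot 10^6$ while adding only 20 productions, which is why exponents with polynomially many bits keep the SLP polynomial in size.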
Remark 16. Note that the bound on the exponents $\sigma(x_i)$ in the previous
proof is still exponential in the input length if the independence alphabet
$(A, I)$ is part of the input as well. The problem is that we do not know
whether the uniform compressed word problem for graph groups (where the
input is an independence alphabet $(A, I)$ together with an SLP over the
terminal alphabet $A^{\pm 1}$) can be solved in polynomial time or at least in NP.
The latter would suffice to get an NP-algorithm for solvability of compressed
exponent equations over a graph group that is part of the input.

9 Transfer results 

In this section, we show that the property of having an NP-algorithm for the
knapsack problem (or compressed exponent equations) is preserved by certain
transformations on groups. Specifically, we show that the class of groups
that admit an NP-algorithm for knapsack is closed under (i) finite extensions,
(ii) HNN-extensions with finite associated subgroups, and (iii) amalgamated
free products with finite identified subgroups. In the case of finite extensions,
the transfer also holds for compressed exponent equations.

Finite extensions and virtually special groups. Our first transfer 
result concerns finite extensions. Together with our result on graph groups, 
this will provide a large class of groups with an NP-algorithm for compressed 
exponent equations. A group G is called virtually special if it is a finite 
extension of a subgroup of a graph group. Recently, this class of groups 
turned out to be very rich. It contains the following classes of groups: 

• Coxeter groups m 

• one-relator groups with torsion [JS] 

• fully residually free groups [35] 

• fundamental groups of hyperbolic 3-manifolds [I] 

The following is our transfer theorem for finite extensions. 

Theorem 17. Let $G$ and $H$ be finitely generated groups such that $H$ is a
finite extension of $G$. If knapsack (resp. solvability of compressed exponent
equations) belongs to NP for $G$, then the same holds for $H$.

From Theorem 15 it follows that solvability of compressed exponent equations
belongs to NP for every subgroup of a graph group. Therefore, our
transfer theorem implies:

Theorem 18. Solvability of compressed exponent equations belongs to NP
for every virtually special group. In particular, solvability of compressed
exponent equations belongs to NP for Coxeter groups, one-relator groups with
torsion, fully residually free groups, and fundamental groups of hyperbolic
3-manifolds.

We need the following statement, which is shown implicitly in
[31, proof of Theorem 4.4].

Lemma 19. Let $G$ and $H$ be finitely generated groups such that $H$ is a finite
extension of $G$ and let $C$ be a set of right coset representatives of $G$. Let
$A$ (resp. $B \supseteq A$) be a finite generating set for $G$ (resp., $H$). From a given
SLP $\mathcal{H}$ over the terminal alphabet $B^{\pm 1}$ one can compute in polynomial time
(i) the unique coset representative $c \in C$ such that $\mathrm{val}(\mathcal{H}) \in Gc$ and (ii) an
SLP $\mathcal{G}$ over the terminal alphabet $A^{\pm 1}$ such that $\mathrm{val}(\mathcal{G})\,c = \mathrm{val}(\mathcal{H})$ holds in
the group $H$.

Proof of Theorem 17. In [28], it was shown that for each finitely generated
group, the knapsack problem and the solvability of exponential expressions
where each variable occurs only once (the latter is called generalized knapsack
problem there) are polynomially inter-reducible. Therefore, we shall
prove that exponential expressions over $H$ can be reduced to exponential
expressions over $G$. Moreover, the reduction preserves the property that each
variable occurs only once. We only describe the case that all inputs are
uncompressed; by means of Lemma 19, the compressed case can be treated
analogously.

Assume that $[H : G] = m$ and let $C$ be a set of coset representatives,
$|C| = m$. Let $A$ be a finite generating set for $G$ and let $B \supseteq A$ be a finite
generating set for $H$. Suppose we are given an exponent equation

$$ v_0 u_1^{x_1} v_1 \cdots u_n^{x_n} v_n = 1 \tag{8} $$

in $H$ where the $v_i$ and the $u_i$ are represented as words over $B^{\pm 1}$. As a first
step, we guess which of the variables $x_i$ assume a value smaller than $m$. For
those that do, we can guess the value and merge the result in a neighboring
$v_i$. This increases the size of the instance by at most a factor of $m$, which
is a constant. Hence, from now on, we only look for solutions to (8) where
$x_i \ge m$ for $1 \le i \le n$.

The next step of our NP algorithm is to guess the cosets occurring in
a solution. This means, we guess $d_0, c_1, d_1, \ldots, c_n, d_n \in C$ and look for a
solution to (8) such that $v_0 u_1^{x_1} v_1 \cdots u_i^{x_i} v_i \in G d_i$ for $0 \le i \le n$ and
$v_0 u_1^{x_1} v_1 \cdots v_{i-1} u_i^{x_i} \in G c_i$ for $1 \le i \le n$. This is equivalent to a solution
where the elements

$$ v_0 d_0^{-1}, \quad d_{i-1} u_i^{x_i} c_i^{-1}, \quad c_i v_i d_i^{-1}, \quad d_n $$

all belong to $G$ for $1 \le i \le n$. We can verify in polynomial time that $v_0 d_0^{-1}$,
$c_i v_i d_i^{-1}$ ($1 \le i \le n$), and $d_n$ belong to $G$. Therefore, we want to check
whether there is a solution to (8) where $d_{i-1} u_i^{x_i} c_i^{-1} \in G$ for $1 \le i \le n$.

Consider the function $f_i : C \to C$, which is defined so that for each
$c \in C$, $f_i(c)$ is the unique element $d \in C$ with $c u_i d^{-1} \in G$. Note that we can
compute $f_i$ in polynomial time. Then there are numbers $1 \le k_i \le m$ such
that $f_i^{m+k_i}(d_{i-1}) = f_i^{m}(d_{i-1})$. With this notation, we have
$d_{i-1} u_i^{x_i} c_i^{-1} \in G$ if and only if $f_i^{x_i}(d_{i-1}) = c_i$.

We may assume that there is an $x_i \ge m$ with $f_i^{x_i}(d_{i-1}) = c_i$: Otherwise,
there is no solution and we can terminate our branch. Therefore, there is a
$0 \le r_i < k_i$ such that $f_i^{m+r_i}(d_{i-1}) = c_i$. This means, we have
$f_i^{x_i}(d_{i-1}) = c_i$ for $x_i \ge m$ if and only if $x_i = m + k_i \cdot y_i + r_i$ for some
$y_i \ge 0$. This allows us to construct an exponent equation over $G$.

Let $e_i = f_i^{m}(d_{i-1})$. Then, the elements $d_{i-1} u_i^{m} e_i^{-1}$, $e_i u_i^{k_i} e_i^{-1}$, and
$e_i u_i^{r_i} c_i^{-1}$ all belong to $G$. Moreover, for $x_i = m + k_i \cdot y_i + r_i$, we have

$$ v_0 u_1^{x_1} v_1 \cdots u_n^{x_n} v_n = (v_0 d_0^{-1}) \left[ \prod_{i=1}^{n} (d_{i-1} u_i^{x_i} c_i^{-1}) (c_i v_i d_i^{-1}) \right] d_n $$

$$ = (v_0 d_0^{-1}) \left[ \prod_{i=1}^{n} (d_{i-1} u_i^{m} e_i^{-1}) (e_i u_i^{k_i} e_i^{-1})^{y_i} (e_i u_i^{r_i} c_i^{-1}) (c_i v_i d_i^{-1}) \right] d_n $$

and each term in parentheses belongs to $G$. This clearly yields an exponent
equation over $G$ (with variables $y_1, \ldots, y_n$) that is solvable if and only if
there is a solution of (8) of the kind we are looking for. It remains to verify
that the new instance is polynomial in size.

There is a constant $\ell$ such that given a word $w \in (B^{\pm 1})^*$ representing $h \in H$
and elements $c, d \in C$ such that $c h d^{-1} \in G$, a word of length at most $\ell \cdot |w|$
representing $c h d^{-1}$ over $A^{\pm 1}$ is computable in linear time. Let
$s_i, t_j \in (B^{\pm 1})^*$ represent $v_i$ and $u_j$, respectively, for $0 \le i \le n$ and
$1 \le j \le n$. Then, the new instance has size at most

$$ \ell |s_0| + \sum_{i=1}^{n} \big( \ell (m + k_i + r_i) |t_i| + \ell |s_i| \big) \le 3 m \ell \, (|s_0| + |t_1| + |s_1| + \cdots + |t_n| + |s_n|), $$

which is linear in the size of the old instance. □
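
The shape $x_i = m + k_i \cdot y_i + r_i$ of the admissible exponents only uses the fact that iterating a function on a finite set is eventually periodic. The Python sketch below illustrates this under assumptions: the function is given as a dictionary on an abstract finite set (standing in for the coset representatives), `k` is taken to be the least period, and the names `iterate` and `solution_shape` are hypothetical.

```python
def iterate(f, x, times):
    """Apply the map f (a dict) to x the given number of times."""
    for _ in range(times):
        x = f[x]
    return x


def solution_shape(f, d, c):
    """Return (m, k, r) such that, for x >= m, f^x(d) == c holds exactly
    when x = m + k*y + r for some y >= 0; return None if no x >= m works."""
    m = len(f)
    e = iterate(f, d, m)            # after m steps we are on a cycle
    k, y = 1, f[e]
    while y != e:                   # least period k of e under f
        y = f[y]
        k += 1
    y = e
    for r in range(k):              # offset r with f^(m+r)(d) == c
        if y == c:
            return m, k, r
        y = f[y]
    return None


# Toy map on {0, 1, 2, 3}: 0 -> 1 -> 2 -> 3 -> 2.
f = {0: 1, 1: 2, 2: 3, 3: 2}
shape = solution_shape(f, 0, 3)
```

In the toy example the exponents $x \ge 4$ with $f^x(0) = 3$ are exactly $x = 4 + 2y + 1$, so `shape` is `(4, 2, 1)`.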
HNN-extensions and amalgamated products. The remaining transfer
results concern two constructions that are of fundamental importance in
combinatorial group theory [36], namely HNN-extensions and amalgamated
products. In their general form, HNN-extensions have been used to construct
groups with an undecidable word problem, which means they may
destroy desirable algorithmic properties. We consider the special case of
finite associated (resp. identified) subgroups, for which these constructions
already play a prominent role, for example, in Stallings’ decomposition of 
groups with infinitely many ends |l2] or the construction of virtually free 
groups [To]. Moreover, these constructions are known to preserve a wide 
range of important structural and algorithmic properties [2] [H [201 [23[ [24l 

[261127113311M1EZ]. 

Suppose $G = \langle \Sigma \mid R \rangle$ is a finitely generated group that has two
isomorphic subgroups $A$ and $B$ with an isomorphism $\varphi : A \to B$. Then the
corresponding HNN-extension is the group

$$ H = \langle G, t \mid t^{-1} a t = \varphi(a) \; (a \in A) \rangle, $$

where $t$ is a new letter not contained in $G$. In other words, $H$ is the group
$H = \langle \Sigma \cup \{t\} \mid R \cup \{t^{-1} a t = \varphi(a) \mid a \in A\} \rangle$ with $t \notin \Sigma$. Intuitively, $H$ is
obtained from $G$ by adding a new element $t$ such that conjugating elements
of $A$ with $t$ applies the isomorphism $\varphi$. Here, $t$ is called the stable letter
and the groups $A$ and $B$ are the associated subgroups. A basic fact about
HNN-extensions is that the group G embeds naturally into H m- 

Here, we only consider the case that $A$ and $B$ are finite groups, so that we
may assume that $A \cup B \subseteq \Sigma$. To exploit the symmetry of the situation, we
use the notation $A(+1) = A$ and $A(-1) = B$. Then, we have
$\varphi^{\alpha} : A(\alpha) \to A(-\alpha)$ for $\alpha \in \{+1, -1\}$. By
$h : (\Sigma^{\pm 1} \cup \{t, t^{-1}\})^* \to H$, we denote the
canonical morphism that maps each word to the element of $H$ it represents.

A word $u \in (\Sigma^{\pm 1} \cup \{t, t^{-1}\})^*$ is called reduced if it does not contain a
factor $t^{-\alpha} w t^{\alpha}$ with $\alpha \in \{-1, 1\}$, $w \in (\Sigma^{\pm 1})^*$, and $h(w) \in A(\alpha)$. Note that
the equation $t^{-1} a t = \varphi(a)$, $a \in A$, allows us to replace such a factor
$t^{-\alpha} w t^{\alpha}$ by $\varphi^{\alpha}(h(w)) \in A(-\alpha) \subseteq \Sigma$. Since this reduces the number of $t$'s in
the word, this allows us to turn every word into an equivalent reduced word.
The following well-known fact describes the reduced words representing the
identity m Lemma 5].

Lemma 20. If $u \in (\Sigma^{\pm 1} \cup \{t, t^{-1}\})^*$ is a reduced word representing $1 \in H$,
then $u \in (\Sigma^{\pm 1})^*$.

Our algorithm for knapsack in HNN-extensions is an adaptation of the 
saturation algorithm of Benois [3] for the membership problem for rational 
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subsets of free groups. Here, for each path spelling aa~^, one adds a parallel 
edge labeled with the empty word. Since knapsack is a special case of this 
problem, we have to use a suitable subclass of automata that is preserved 
by our saturation and corresponds to the knapsack problem. 

Let $G$ be a group with finite generating set $\Sigma$. A finite automaton over
$G$ is an NFA $\mathcal{A} = (Q, \Sigma^{\pm 1}, \Delta, q_0, F)$. A (directed) cycle in $\mathcal{A}$ is a sequence
$p_1, \ldots, p_n$ of states such that there are edges $(p_i, a_i, p_{i+1})$ for $1 \le i \le n-1$
and $(p_n, a_n, p_1)$ with $a_1, \ldots, a_n \in \Sigma^{\pm 1} \cup \{\varepsilon\}$. In particular, a single state with
a loop is regarded as a cycle. A sequence $p_1, \ldots, p_n$ is an induced cycle if
it is a cycle and there are no other edges among the states $p_1, \ldots, p_n$. We
call $\mathcal{A}$ a knapsack automaton if every strongly connected component of $\mathcal{A}$
is a singleton or an induced cycle. The membership problem for knapsack
automata over $G$ is the following decision problem:

Input: A knapsack automaton $\mathcal{A}$ over $G$ and a word $w \in (\Sigma^{\pm 1})^*$.

Question: Does $\mathcal{A}$ accept a word $w'$ that represents the same element of $G$
as $w$?
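
The structural condition on knapsack automata can be checked mechanically. The sketch below is one possible reading of the definition, under the assumption that edges are given as a duplicate-free list of triples and that a one-state component may carry at most one loop; the function names are hypothetical. It relies on the observation that a strongly connected component with $n \ge 2$ states is an induced cycle exactly when it has $n$ internal edges.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]   # (source state, label, target state)


def reachable(start: str, edges: List[Edge]) -> Set[str]:
    """States reachable from start (including start itself)."""
    seen, stack = {start}, [start]
    while stack:
        p = stack.pop()
        for (s, _, t) in edges:
            if s == p and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def is_knapsack_automaton(states: List[str], edges: List[Edge]) -> bool:
    reach = {p: reachable(p, edges) for p in states}
    for p in states:
        # Strongly connected component of p, computed naively.
        scc = {q for q in reach[p] if p in reach[q]}
        internal = [e for e in edges if e[0] in scc and e[2] in scc]
        if len(scc) == 1:
            # Assumption: a single state is fine with no loop or exactly one loop.
            if len(internal) > 1:
                return False
        elif len(internal) != len(scc):
            # A strongly connected component on n states with exactly n
            # internal edges is a simple cycle without additional edges.
            return False
    return True
```

The quadratic reachability computation is chosen for readability; a linear-time SCC algorithm would serve equally well.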

Indeed, the membership problem for knapsack automata corresponds
precisely to the knapsack problem in the following sense.

Lemma 21. For each finitely generated group, knapsack belongs to NP if 
and only if membership for knapsack automata belongs to NP. 

Proof. It is easy to turn a knapsack instance into a knapsack automaton:
Given words $w_1, \ldots, w_k, w \in (\Sigma^{\pm 1})^*$, one can clearly construct a knapsack
automaton accepting $w_1^* \cdots w_k^*$. Then, the knapsack problem amounts to
deciding the membership problem for $w$.
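
The first direction of the proof can be illustrated as follows. This is a minimal sketch under stated assumptions: words are Python strings over single-character letters, the label `''` denotes the empty word, and the helper names `knapsack_automaton` and `accepts` are hypothetical. Each word $w_i$ becomes a cycle through a hub state, and consecutive hubs are linked by $\varepsilon$-edges. The function `accepts` tests membership of a literal word in the language, not equality in the group.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, str, int]   # (source, label, target); label '' is the empty word


def knapsack_automaton(words: List[str]) -> Tuple[int, int, List[Edge]]:
    """Return (initial, final, edges) of an NFA for w_1^* ... w_k^* (k >= 1)."""
    edges: List[Edge] = []
    hubs: List[int] = []
    next_state = 0
    for w in words:
        hub = next_state
        next_state += 1
        hubs.append(hub)
        cur = hub
        for pos, letter in enumerate(w):
            if pos == len(w) - 1:
                tgt = hub                  # close the cycle for w
            else:
                tgt = next_state
                next_state += 1
            edges.append((cur, letter, tgt))
            cur = tgt
    for a, b in zip(hubs, hubs[1:]):
        edges.append((a, '', b))           # move on to the next cycle
    return hubs[0], hubs[-1], edges


def accepts(initial: int, final: int, edges: List[Edge], word: str) -> bool:
    """Standard NFA simulation with epsilon-closure."""
    def closure(states: Set[int]) -> Set[int]:
        seen, stack = set(states), list(states)
        while stack:
            p = stack.pop()
            for (s, a, t) in edges:
                if s == p and a == '' and t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({initial})
    for letter in word:
        step = {t for (s, a, t) in edges if s in current and a == letter}
        current = closure(step)
    return final in current
```

Every strongly connected component of the resulting automaton is either a single hub without a loop (for an empty $w_i$) or the cycle spelling $w_i$, so it is a knapsack automaton in the sense defined above.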

Now, suppose we are given a knapsack automaton $\mathcal{A}$ over $G$ and a word
$w \in (\Sigma^{\pm 1})^*$. We can clearly turn $\mathcal{A}$ into a knapsack automaton that first
reads $w^{-1}$ and then behaves like $\mathcal{A}$. Therefore, it suffices to solve the
membership problem in the case that $w = \varepsilon$.

Consider a run $r$ in $\mathcal{A}$ from the initial to a final state. Let $c_1, \ldots, c_\ell$ be
the sequence of strongly connected components it visits. For each $c_i$ that
is not a singleton, let $p_i$ and $q_i$ be the states where $r$ enters and leaves $c_i$,
respectively. We call the sequence $c_1, \ldots, c_\ell$, together with the $p_i$ and $q_i$, the
skeleton of $r$.

Our algorithm guesses a skeleton. Since $\mathcal{A}$ is a knapsack automaton,
from this skeleton, we can determine words $v_0, u_1, v_1, \ldots, u_n, v_n \in (\Sigma^{\pm 1})^*$
such that $v_0 u_1^* v_1 \cdots u_n^* v_n$ is precisely the set of words labeling a path with
this skeleton. Hence, deciding the membership problem for $\mathcal{A}$ amounts to
checking whether there are $x_1, \ldots, x_n \in \mathbb{N}$ with
$h_0 g_1^{x_1} h_1 \cdots g_n^{x_n} h_n = 1$, where
$g_i$ ($h_j$, respectively) is the element represented by $u_i$ ($v_j$, respectively). This
is an exponential equation with pairwise distinct variables and the solvability
of such equations is called the generalized knapsack problem in [28], where
it was shown to be polynomially inter-reducible with the knapsack problem.

□ 


Theorem 22. Let H be an HNN-extension of the finitely generated group 
G with finite associated subgroups. If knapsack for G belongs to NP, then 
the same holds for H. 

Proof. According to Lemma 21, it suffices to prove that if membership for
knapsack automata over $G$ belongs to NP, then the same holds for $H$. Hence,
let $\mathcal{A}$ be a knapsack automaton over $H$. As explained above, it suffices to
check membership for the group identity, i.e., to check whether $\mathcal{A}$ accepts a
word from $h^{-1}(1)$.

The basic idea of the proof is to saturate $\mathcal{A}$, yielding a knapsack
automaton that is saturated, meaning: For each path from $p$ to $q$ labeled with
a word $t^{-\alpha} w t^{\alpha}$ with $h(w) \in A(\alpha)$, there is an edge from $p$ to $q$ labeled with
$\varphi^{\alpha}(h(w)) \in A(-\alpha)$. We will then show that $\mathcal{A}$ accepts a word from $h^{-1}(1)$
if and only if it accepts a word from $h^{-1}(1) \cap (\Sigma^{\pm 1})^*$. This will allow us to
remove all $t^{\pm 1}$-edges and apply the algorithm for $G$. A path in a knapsack
automaton that is labeled by a word $t^{-\alpha} w t^{\alpha}$ with $w \in (\Sigma^{\pm 1})^* \cap h^{-1}(A(\alpha))$
is called a reduction path. Among other things, the algorithm will introduce
a shortcut edge for the reduction path, namely

$$p \xrightarrow{\varphi^{\alpha}(a)} q, \tag{9}$$

where $a = h(w) \in A(\alpha)$. Observe that $\varphi^{\alpha}(a) \in A(-\alpha)$ and $\varphi^{\alpha}(a) =
h(t^{-\alpha} w t^{\alpha})$. By introducing intermediate states, we may assume that (i) there
is no edge between states that belong to distinct cycles and (ii) the initial
and the final state do not lie on a cycle.
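
The identity $\varphi^{\alpha}(a) = h(t^{-\alpha} w t^{\alpha})$ used for the shortcut edge can be checked separately for the two values of $\alpha$. The following short derivation is a sketch that uses only the defining relation $t^{-1} a' t = \varphi(a')$ for $a' \in A$; the auxiliary name $a'$ is introduced here for illustration.

```latex
% Case \alpha = +1: here a = h(w) \in A(+1) = A, and the defining relation gives
h(t^{-1} w t) \;=\; t^{-1}\, a\, t \;=\; \varphi(a) \;\in\; B = A(-1).

% Case \alpha = -1: here a = h(w) \in A(-1) = B. Put a' = \varphi^{-1}(a) \in A.
% The defining relation for a' reads t^{-1} a' t = \varphi(a') = a, hence
h(t\, w\, t^{-1}) \;=\; t\, a\, t^{-1} \;=\; t\,(t^{-1} a' t)\,t^{-1} \;=\; a'
  \;=\; \varphi^{-1}(a) \;\in\; A = A(+1).
```

In both cases the factor $t^{-\alpha} w t^{\alpha}$ evaluates to a single generator from $A(-\alpha) \subseteq \Sigma$, which is why the shortcut edge carries a label of length one.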


Phase 1. The saturation proceeds in two phases. In the first phase, we
saturate the directed cycles, which are the strongly connected components.
That is, we modify the automaton so that there is no reduction path
between two states on a cycle. This is done as follows. We successively guess
tuples $(p, \alpha, a, q)$ where $p$ and $q$ are states from the same cycle, $\alpha \in \{-1, 1\}$,
and $a \in A(\alpha)$. Then, employing the NP algorithm for $G$, we can clearly
verify that there is a reduction path spelling $t^{-\alpha} w t^{\alpha}$ from $p$ to $q$ with
$w \in (\Sigma^{\pm 1})^*$ and $h(w) = a$. Note that on this path, the first letter ($t^{-\alpha}$) occurs
only once, meaning the path visits each state at most once (i.e. it makes at
most one round in the cycle). Let

$$p = r_0 \xrightarrow{u_1} r_1 \cdots \xrightarrow{u_n} r_n = q \tag{10}$$

be the reduction path and let

$$q = r_n \xrightarrow{u_{n+1}} r_{n+1} \cdots \xrightarrow{u_m} r_m = r_0 = p$$

be the rest of the cycle with $u_1, \ldots, u_m \in \Sigma^{\pm 1} \cup \{t, t^{-1}\}$ and
$u_1 \cdots u_n = t^{-\alpha} w t^{\alpha}$. In particular, $m$ is the length of the cycle. Let us now describe the
saturation step. We remove all edges from (10) and all states incident to
them, except for $p$ and $q$. Instead, we add a shortcut edge (9). For each state
$s$ not on the cycle and for which there is an edge $(s, v, r_i)$, $1 \le i \le n-1$, we
glue in a path

$$s \xrightarrow{v} s_0 \xrightarrow{u_{i+1}} s_1 \cdots \xrightarrow{u_n} s_{n-i} = q, \tag{11}$$

where $s_0, \ldots, s_{n-i-1}$ are new states. Analogously, for each state $s$ not on
the cycle and for which there is an edge $(r_i, v, s)$, $1 \le i \le n-1$, we glue in
a path

$$p = s_0 \xrightarrow{u_1} s_1 \cdots \xrightarrow{u_i} s_i \xrightarrow{v} s, \tag{12}$$

where $s_1, \ldots, s_i$ are new states. Moreover, for each pair $(s, s')$ of states
not on the cycle and for which there are edges $(s, v, r_i)$ and $(r_j, v', s')$ with
$1 \le i \le j \le n-1$, we glue in a path

$$s \xrightarrow{v} s_0 \xrightarrow{u_{i+1}} s_1 \cdots \xrightarrow{u_j} s_{j-i} \xrightarrow{v'} s', \tag{13}$$

where $s_0, \ldots, s_{j-i}$ are new states. This completes our saturation step.

Let $\mathcal{A}'$ be the automaton resulting from one saturation step from $\mathcal{A}$.
Then, $\mathcal{A}'$ is clearly a knapsack automaton: We only connect states that
were connected before. Moreover, for states $s, s'$ that exist in $\mathcal{A}$ and in $\mathcal{A}'$,
the set of group elements represented on paths from $s$ to $s'$ does not change.
Indeed, a path that avoids our cycle still exists. A path that involves the
whole path (10) can use the shortcut edge (9). A path that either (i) enters
(10) after $p$ and follows it until $q$ or (ii) follows (10) partly and then leaves
before $q$ can use the new paths (11) or (12), respectively. Finally, a path
that follows only a part of (10) that starts after $p$ and ends before $q$ can use
the new path (13) instead.

Let us estimate the number of added states during Phase 1. The degree
of a cycle is the number of edges entering or leaving the cycle. Let $d$ be the
degree of our cycle. Let us first consider a single saturation step. The new
states of type (11) or (12) are each at most $d \cdot n$ many. The new states of type
(13) are at most $d^2 \cdot n$ many. Hence, we add at most $(d^2 + 2d)n \le (d^2 + 2d)m$
states in this saturation step. Observe that in this step, the length of the
affected cycle decreases ($t^{-\alpha} w t^{\alpha}$ has length $\ge 2$ and $\varphi^{\alpha}(a) \in A(-\alpha)$
has length 1) and its degree is unchanged (the new edges from (11) and
(12) clearly preserve the degree and those of (13) do not increase the degree
because by our assumption that no edge connects two cycles, $s$ and $s'$ do
not belong to a cycle). Now, we consider the whole phase. Suppose in the
beginning, $\mathcal{A}$ has $c$ cycles of maximal degree $d$ and maximal length $\ell$. Then,
each saturation step adds at most $(d^2 + 2d)\ell$ states. Moreover, there can be
at most $\ell \cdot c$ saturation steps, so that the first phase adds at most $(d^2 + 2d)\ell^2 c$
states, which is polynomial in the size of the input automaton.
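
To see how the bound on the number of new states composes, consider the following worked instance. The values $d = 3$, $\ell = 5$, $c = 2$ are purely illustrative and do not come from the paper.

```latex
% Illustrative values: maximal degree d = 3, maximal cycle length \ell = 5, c = 2 cycles.
\text{states added per saturation step} \;\le\; (d^2 + 2d)\,\ell \;=\; (9 + 6)\cdot 5 \;=\; 75,
\qquad
\text{number of saturation steps} \;\le\; \ell \cdot c \;=\; 5 \cdot 2 \;=\; 10,
\qquad
\text{states added in Phase 1} \;\le\; 75 \cdot 10 \;=\; 750 \;=\; (d^2 + 2d)\,\ell^2 c .
```

The step count is bounded by $\ell \cdot c$ because every step shortens one of the $c$ cycles, each of which has length at most $\ell$ at the start.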

Phase 2. In the second phase, we consider reduction paths between states 
that belong to distinct strongly connected components. Since here, adding 
an edge that runs parallel to the reduction path cannot violate the property 
of being a knapsack automaton, we may saturate by simply introducing new 
edges. 

Again, we successively guess tuples $(p, \alpha, a, q)$ where $\alpha \in \{-1, 1\}$, and
$a \in A(\alpha)$. However, we require that $p$ and $q$ are not from the same strongly
connected component and that there is no shortcut edge (9) yet. As above,
we employ the NP algorithm for $G$ to verify that there is a reduction path
spelling $t^{-\alpha} w t^{\alpha}$ from $p$ to $q$ with $w \in (\Sigma^{\pm 1})^*$ and $h(w) = a$. Then, we add
the shortcut edge (9). As before, we have $h(t^{-\alpha} w t^{\alpha}) = \varphi^{\alpha}(a) \in A(-\alpha)$.

This is all we do in the saturation step. Since now, we only add edges (and 
no states) and each correct guess leads to an increase in the number of edges, 
our sequence of saturation steps must terminate after a polynomial number 
of steps. This concludes the second phase and thus the saturation. 

Finally, the algorithm applies the NP-algorithm for $G$. More precisely, we
remove all edges labeled $t^{\pm 1}$. This yields a knapsack automaton over $G$,
so that we can use the algorithm for $G$ to check whether it accepts $1 \in G$.
Then, we answer “yes” if and only if the algorithm for $G$ does.
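
The overall flow (add shortcut edges, drop the $t^{\pm 1}$-edges, decide in $G$) can be tried out on a toy instance. The sketch below is a deliberate simplification and not the algorithm of the proof: it assumes the finite base group $\mathbb{Z}_2 \times \mathbb{Z}_2$ with $A = \{(0,0),(1,0)\}$, $B = \{(0,0),(0,1)\}$ and $\varphi$ swapping coordinates, it replaces the NP oracle for $G$ by exhaustive search over (state, element) pairs, and it adds parallel shortcut edges everywhere in the style of Benois, without the cycle surgery of Phase 1 (so the result need not remain a knapsack automaton). All names are hypothetical; `'T'` stands for $t^{-1}$.

```python
from typing import Iterable, List, Set, Tuple, Union

Elem = Tuple[int, int]                 # toy base group Z_2 x Z_2 (assumption)
Label = Union[Elem, str]               # group element, 't', or 'T' (= t^{-1})
Edge = Tuple[int, Label, int]

E0: Elem = (0, 0)
A_OF = {+1: {(0, 0), (1, 0)}, -1: {(0, 0), (0, 1)}}   # A(+1) = A, A(-1) = B


def mul(g: Elem, k: Elem) -> Elem:
    return ((g[0] + k[0]) % 2, (g[1] + k[1]) % 2)


def phi(a: Elem, alpha: int) -> Elem:
    return (a[1], a[0])                # the swap is its own inverse


def g_reach(start: int, edges: Set[Edge]) -> Set[Tuple[int, Elem]]:
    """Pairs (state, element) reachable from start along t-free paths."""
    seen = {(start, E0)}
    stack: List[Tuple[int, Elem]] = [(start, E0)]
    while stack:
        p, g = stack.pop()
        for (s, x, t) in edges:
            if s == p and x not in ('t', 'T'):
                nxt = (t, mul(g, x))
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


def saturate(edges: Iterable[Edge]) -> Set[Edge]:
    """Add a shortcut edge for every reduction path until nothing changes."""
    edges = set(edges)
    while True:
        new: Set[Edge] = set()
        for (p, x, p1) in edges:
            if x not in ('t', 'T'):
                continue
            alpha = 1 if x == 'T' else -1
            closing = 't' if alpha == 1 else 'T'
            for (q1, a) in g_reach(p1, edges):
                if a in A_OF[alpha]:
                    for (s, y, q) in edges:
                        if s == q1 and y == closing:
                            new.add((p, phi(a, alpha), q))
        if new <= edges:
            return edges
        edges |= new


def accepts_identity(initial: int, finals: Set[int], edges: Iterable[Edge]) -> bool:
    sat = saturate(edges)
    g_only = {e for e in sat if e[1] not in ('t', 'T')}   # drop t-edges
    reach = g_reach(initial, g_only)
    return any((f, E0) in reach for f in finals)
```

For example, the path automaton spelling $t^{-1}\,(1,0)\,t\,(0,1)$ is accepted, since the shortcut edge labeled $\varphi((1,0)) = (0,1)$ cancels against the final letter.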

It remains to be shown that this algorithm is sound and complete. If we
answer “yes”, then the input automaton accepts $1 \in H$. This is because each
saturation step preserves the set of accepted elements. On the other hand,
suppose the input automaton $\mathcal{A}$ accepts $1 \in H$ and consider the branch of
our nondeterministic algorithm that guesses in such a way that in the end,
there are no more reduction paths without a shortcut edge. Let $\mathcal{B}$ be the
resulting saturated knapsack automaton. Since $\mathcal{A}$ accepts $1 \in H$, there is
an accepting run in $\mathcal{B}$ that accepts $1 \in H$. Consider such a run reading a
word $u \in (\Sigma^{\pm 1} \cup \{t, t^{-1}\})^*$ with a minimal number of occurrences of $t^{\pm 1}$. Since
$\mathcal{B}$ is saturated, this implies that $u$ is reduced: Otherwise, $u$ would have a
factor $t^{-\alpha} w t^{\alpha}$ with $w \in (\Sigma^{\pm 1})^*$ and $h(w) \in A(\alpha)$. This factor, however,
lies on a reduction path and we could have used the shortcut edge instead,
which would result in a run with fewer $t$'s. Since $u$ is reduced and represents
$1 \in H$, it contains neither $t$ nor $t^{-1}$ (Lemma 20). Hence, our application of
the algorithm for $G$ answers “yes” because of $u$. □

In our last transfer theorem, we consider amalgamated free products. For
$i \in \{0, 1\}$, let $G_i = \langle \Sigma_i \mid R_i \rangle$ be a finitely generated group and let $F$
be a finite group that is embedded in each $G_i$, meaning that there are
injective morphisms $\varphi_i\colon F \to G_i$ for $i \in \{0, 1\}$. Then, the free product
with amalgamation with identified subgroup $F$ is defined as

$$G_0 *_F G_1 = \langle G_0 * G_1 \mid \varphi_0(f) = \varphi_1(f)\ (f \in F) \rangle.$$

Here, $G_0 * G_1$ denotes the free product $G_0 * G_1 = \langle \Sigma_0 \uplus \Sigma_1 \mid R_0 \uplus R_1 \rangle$. Note
that the product depends on the morphisms $\varphi_i$, although they are omitted
in the notation $G_0 *_F G_1$. Equivalently, $G_0 *_F G_1$ is given by the presentation

$$\langle \Sigma_0 \uplus \Sigma_1 \mid R_0 \uplus R_1 \cup \{\varphi_0(f) = \varphi_1(f) \mid f \in F\} \rangle.$$

Let us consider the free product $G_0 * G_1$. Let
$h\colon (\Sigma_0^{\pm 1} \cup \Sigma_1^{\pm 1})^* \to G_0 * G_1$ be the canonical morphism that maps a word to the group element it
represents. If $w \in (\Sigma_0^{\pm 1} \cup \Sigma_1^{\pm 1})^*$, then a syllable of $w$ is a factor of $w$ that is
contained in $(\Sigma_0^{\pm 1})^+ \cup (\Sigma_1^{\pm 1})^+$ and that is maximal with this property. The
definition of the free product immediately implies the following.

Lemma 23. If in the free product $G_0 * G_1$, a word represents $1 \in G_0 * G_1$,
then it contains a syllable $s$ with $h(s) = 1$.
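
Syllables are simply the maximal blocks of letters coming from the same factor. As a small illustration, the sketch below splits a word into syllables under an assumed encoding: $\Sigma_0 = \{a, b\}$, $\Sigma_1 = \{x, y\}$, and an uppercase letter denotes the inverse of the corresponding lowercase generator. The names `SIGMA0`, `SIGMA1`, `factor_of`, and `syllables` are hypothetical.

```python
from itertools import groupby
from typing import List

SIGMA0 = set("ab")   # generators of G_0 (assumed encoding)
SIGMA1 = set("xy")   # generators of G_1 (assumed encoding)


def factor_of(letter: str) -> int:
    """Index i of the factor G_i that the letter (or its inverse) belongs to."""
    base = letter.lower()
    if base in SIGMA0:
        return 0
    if base in SIGMA1:
        return 1
    raise ValueError(f"unknown letter: {letter!r}")


def syllables(w: str) -> List[str]:
    """Maximal factors of w that lie entirely in one of the two alphabets."""
    return ["".join(block) for _, block in groupby(w, key=factor_of)]
```

For instance, `syllables("abXxAy")` yields `["ab", "Xx", "A", "y"]`; the syllable `"Xx"` spells $x^{-1}x$ and therefore represents the identity, which is the kind of syllable whose existence Lemma 23 guarantees in a word representing 1.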

The transfer theorem states that taking amalgamated products with 
finite identified subgroups preserves NP membership of knapsack. 

Theorem 24. Let $G_0$ and $G_1$ be finitely generated groups with a common
finite subgroup $F$. If knapsack for $G_0$ and for $G_1$ belongs to NP, then the
same holds for the amalgamated product $G_0 *_F G_1$.

Proof. It is well-known [?, Theorem 2.6, p. 187] that $G_0 *_F G_1$ can be
embedded into the HNN-extension

$$H = \langle G_0 * G_1, t \mid t^{-1} \varphi_0(f)\, t = \varphi_1(f)\ (f \in F) \rangle$$

by way of the morphism $\Phi\colon G_0 *_F G_1 \to H$ with

$$\Phi(g) = \begin{cases} t^{-1} g\, t & \text{if } g \in G_0, \\ g & \text{if } g \in G_1. \end{cases}$$
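
As a sanity check, one can verify that such a map respects the amalgamation relation. The lines below are a sketch that writes $\Phi$ for the embedding and $H$ for the HNN-extension (both names are assumed here, since the source does not preserve them) and uses only the defining relation $t^{-1}\varphi_0(f)\,t = \varphi_1(f)$.

```latex
% For f \in F, the element \varphi_0(f) lies in G_0 and \varphi_1(f) lies in G_1, so
\Phi(\varphi_0(f)) \;=\; t^{-1}\,\varphi_0(f)\,t \;=\; \varphi_1(f) \;=\; \Phi(\varphi_1(f)).
% Hence the relation \varphi_0(f) = \varphi_1(f) of the amalgamated product is
% mapped to a valid identity in H, so \Phi is well defined on G_0 *_F G_1.
```

Injectivity is the substantial part of the cited theorem and is not covered by this check.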


Since Theorem 22 already tells us that NP membership of knapsack is
preserved by HNN-extensions with finite associated subgroups, it suffices to
show that free products preserve NP membership.

We use a slight modification of the nondeterministic algorithm from the
proof of Theorem 22 and show that if membership for knapsack automata
belongs to NP for $G_0$ and $G_1$, the same holds for $G_0 * G_1$. During the
saturation, we maintain the following invariants:


(i) There is no edge between states that belong to distinct cycles. 

(ii) The initial and the final states do not lie on a cycle.

(iii) Every edge entering a cycle is labeled with the empty word $\varepsilon$.


By introducing intermediate states, we can clearly achieve them in the
beginning. As in the proof of Theorem 22, we add shortcut edges for reduction
paths. For states $p$ and $q$, a reduction path (from $p$ to $q$) is a path labeled
by a word $w \in (\Sigma_i^{\pm 1})^*$ for some $i \in \{0, 1\}$ such that (a) $h(w) = 1$ and
(b) if $p$ and $q$ lie on a cycle, then this cycle also contains a letter in $\Sigma_{1-i}^{\pm 1}$.
Here, we need the additional condition (b) to make sure that short-cutting
a reduction path actually reduces the cycle. (Without requiring (b), it could
happen that a reduction path occupies more than one round of a cycle.) A
shortcut edge is then simply $(p, \varepsilon, q)$.

Again, our saturation consists of two phases and in the first one, we
shortcut reduction paths inside of cycles. We guess tuples $(p, i, q)$ such that
$p$ and $q$ lie on a cycle and $i \in \{0, 1\}$. Using the NP-algorithm for $G_i$, we
verify that there is a reduction path from $p$ to $q$ labeled with $w \in (\Sigma_i^{\pm 1})^*$.
Then, we proceed as in the proof of Theorem 22 and replace the reduction
path with a shortcut edge and add new paths almost as in (11), (12), and
(13): The only difference is that the new paths of type (11) are prolonged
with an $\varepsilon$-edge at the end so as to preserve invariant (iii).

While in the proof of Theorem 22 the length of the cycle decreases in a
saturation step, this is not guaranteed here. This is because in the proof
of Theorem 22 we always remove edges labeled $t$ and $t^{-1}$. Here, it could
happen that the reduction path consists of one edge labeled $a \in \Sigma_i^{\pm 1}$ with
$h(a) = 1$. Then, the length of the cycle is unchanged. We do, however,
reduce the number of letters on the cycle. Therefore, an analogous estimation
of the number of introduced states applies and shows that it is polynomially
bounded.

The second phase works just as for Theorem 22. We guess triples $(p, i, q)$
such that $p$ and $q$ are not in the same strongly connected component but
there is no shortcut edge $(p, \varepsilon, q)$ yet. Then, we verify that there is a
reduction path from $p$ to $q$ with label $w \in (\Sigma_i^{\pm 1})^*$. If this is the case, we add a
shortcut edge $(p, \varepsilon, q)$.

In the end, we guess $i \in \{0, 1\}$ and verify, using the NP-algorithm for $G_i$,
that the automaton, restricted to $\Sigma_i^{\pm 1}$, accepts a word representing $1 \in G_i$.

Let us show that this algorithm is sound and complete. As above, we
can argue that if it answers “yes”, then the input automaton clearly accepts
$1 \in G_0 * G_1$. For the completeness, we have to argue slightly differently.
Suppose the input automaton accepts a word representing $1 \in G_0 * G_1$. We
consider a branch of the nondeterministic algorithm that saturates every
reduction path. Let $\mathcal{B}$ be the resulting automaton. Since $\mathcal{B}$ also accepts a
word representing $1 \in G_0 * G_1$, we consider such a word $w \in (\Sigma_0^{\pm 1} \cup \Sigma_1^{\pm 1})^*$
with a minimal number of syllables.

Suppose $w$ has more than one syllable. By Lemma 23, it contains a
syllable $s \in (\Sigma_i^{\pm 1})^+$ with $h(s) = 1$. Consider the accepting run $r$ for $w$ and
let $p$ and $q$ be the states occupied before and after reading $s$. The path
taken by $r$ from $p$ to $q$ is not a reduction path, because otherwise we could
have taken a shortcut edge instead, in contradiction to the minimality of
$w$. This means that $p$ and $q$ lie on a cycle that contains only letters in $\Sigma_i^{\pm 1}$.
Since $s$ is a syllable, this implies that $r$ enters this cycle at $p$. Let $p'$ be the
state occupied in $r$ directly before $p$: Note that $r$ cannot start in $p$ because
of invariant (ii). Because of invariant (iii), the edge from $p'$ to $p$ is labeled
with $\varepsilon$. Thus, the path taken by $r$ from $p'$ to $q$ is a reduction path, again
contradicting the minimality of $w$.

Hence, $w$ has at most one syllable, which means $w \in (\Sigma_j^{\pm 1})^*$ for some
$j \in \{0, 1\}$ and our application of the NP-algorithm for $G_j$ answers “yes”. □

10 Hardness results 

Since knapsack for binary encoded integers is NP-complete, it follows that
the compressed knapsack problem is NP-hard for every group that contains
an element of infinite order. In this section, we prove that (uncompressed)
knapsack and subset sum are NP-complete for a direct product of two free
groups of rank at least two. This solves an open problem from [16].

With $F(\Sigma)$ we denote the free group generated by the set $\Sigma$. Moreover,
let $F_2 = F(\{a, b\})$.


Theorem 25. The subset sum problem and the knapsack problem are
NP-complete for $F_2 \times F_2$. For knapsack, NP-hardness already holds for the
variant where the exponent variables are allowed to take values from $\mathbb{Z}$ (see
Remark ?).

Proof. In [30] it was shown that there exists a fixed set $D \subseteq F_2 \times F_2$ such
that the following problem (called the bounded submonoid problem) is
NP-complete:

Input: A unary encoded number $n$ (i.e., $n$ is given by the string $a^n$) and
an element $g \in F_2 \times F_2$.

Question: Do there exist $g_1, \ldots, g_n \in D$ (not necessarily distinct) such that
$g = g_1 g_2 \cdots g_n$ in $F_2 \times F_2$?

Let us briefly explain the NP-hardness proof, since we will reuse it. We start
with a finitely presented group $\langle \Sigma, R \rangle$ having an NP-complete word problem
and a polynomial Dehn function. Such a group was constructed in [7]. To
this group, the following classical construction by Mihailova [38] is applied:
Let

$$D = \{(r^e, 1) \mid r \in R,\ e \in \{-1, 1\}\} \cup \{(a, a) \mid a \in \Sigma^{\pm 1}\},$$

which is viewed as a subset of $F(\Sigma) \times F(\Sigma)$. Note that $D$ is closed under
taking inverses. Let $\langle D \rangle \le F(\Sigma) \times F(\Sigma)$ be the subgroup generated by $D$.
Mihailova proved that for every word $w \in (\Sigma^{\pm 1})^*$ the following equivalence
holds:

$$w = 1 \text{ in } \langle \Sigma, R \rangle \iff (w, 1) \in \langle D \rangle \text{ in } F(\Sigma) \times F(\Sigma).$$
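
Mihailova's set $D$ is easy to build explicitly for a small presentation. The following sketch is an illustration only: it assumes words are strings in which an uppercase letter denotes the inverse of the lowercase generator, it uses the presentation $\langle a, b \mid aba^{-1}b^{-1} \rangle$ of $\mathbb{Z}^2$ as a toy example, and it decides whether $(w, 1)$ is a product of at most `n_max` elements of $D$ by exhaustive search (exponential in `n_max`). All function names are hypothetical.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]   # element of F(Sigma) x F(Sigma), both components freely reduced


def inv(w: str) -> str:
    """Formal inverse: reverse the word and invert every letter."""
    return w[::-1].swapcase()


def free_reduce(w: str) -> str:
    """Cancel adjacent pairs x x^{-1} using a stack."""
    out: List[str] = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()
        else:
            out.append(c)
    return "".join(out)


def mihailova_set(sigma: str, relators: List[str]) -> Set[Pair]:
    """D = {(r^e, 1)} united with {(a, a)} for a ranging over Sigma^{+-1}."""
    d: Set[Pair] = set()
    for r in relators:
        d.add((free_reduce(r), ""))
        d.add((free_reduce(inv(r)), ""))
    for a in sigma:
        d.add((a, a))
        d.add((a.swapcase(), a.swapcase()))
    return d


def pair_mul(x: Pair, y: Pair) -> Pair:
    return (free_reduce(x[0] + y[0]), free_reduce(x[1] + y[1]))


def in_bounded_product(w: str, d: Set[Pair], n_max: int) -> bool:
    """Is (w, 1) a product of at most n_max elements of d? Brute force."""
    target = (free_reduce(w), "")
    layer: Set[Pair] = {("", "")}
    if target in layer:
        return True
    for _ in range(n_max):
        layer = {pair_mul(x, g) for x in layer for g in d}
        if target in layer:
            return True
    return False
```

With `d = mihailova_set("ab", ["abAB"])`, the word `"aabABA"` (the relator conjugated by $a$) satisfies `in_bounded_product("aabABA", d, 3)` via the product $(a,a)(aba^{-1}b^{-1},1)(a^{-1},a^{-1})$, while `"ab"`, which is nontrivial in $\mathbb{Z}^2$, is not found.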

Moreover, based on the fact that $\langle \Sigma, R \rangle$ has a polynomial Dehn function
$p(n)$, the following equivalence was shown in [30], where $q(n) = p(n) + 8(c \cdot
p(n) + n)$, $c$ is the maximal length of a relator in $R$, and $D^n$ is the set of all
products of $n$ elements from $D$:

$$w = 1 \text{ in } \langle \Sigma, R \rangle \iff \exists n \le q(|w|) : (w, 1) \in D^n \text{ in } F(\Sigma) \times F(\Sigma).$$

From these two equivalences it follows directly that the following three
statements are equivalent for all words $w \in (\Sigma^{\pm 1})^*$, where $D = \{g_1, g_2, \ldots, g_k\}$:

• $w = 1$ in $\langle \Sigma, R \rangle$

• $(w, 1) = \prod_{i=1}^{q(|w|)} (g_1^{a_{1,i}} \cdots g_k^{a_{k,i}})$ in $F(\Sigma) \times F(\Sigma)$ for some $a_{j,i} \in \{0, 1\}$

• $(w, 1) = \prod_{i=1}^{q(|w|)} (g_1^{a_{1,i}} \cdots g_k^{a_{k,i}})$ in $F(\Sigma) \times F(\Sigma)$ for some $a_{j,i} \in \mathbb{Z}$

This shows that the subset sum problem and the knapsack problem are
NP-hard for the group $F(\Sigma) \times F(\Sigma)$, where for knapsack we allow integer
exponents. To get the same results for $F_2 \times F_2$, we use the fact that $F_2$
contains a copy of $F(\Sigma)$. □